Hyperelliptic Curves, L-polynomials, and Random Matrices

Kiran S. Kedlaya and Andrew V. Sutherland

Abstract. We analyze the distribution of unitarized L-polynomials $\bar L_p(T)$
(as p varies) obtained from a hyperelliptic curve of genus $g \le 3$ defined over
Q. In the generic case, we find experimental agreement with a predicted
correspondence (based on the Katz-Sarnak random matrix model) between the
distributions of $\bar L_p(T)$ and of characteristic polynomials of random matrices
in the compact Lie group USp(2g). We then formulate an analogue of the
Sato-Tate conjecture for curves of genus 2, in which the generic distribution is
augmented by 22 exceptional distributions, each corresponding to a compact
subgroup of USp(4). In every case, we exhibit a curve closely matching the
proposed distribution, and can find no curves unaccounted for by our
classification.



1. Introduction 

For C a smooth projective curve of genus g defined over Q and each prime
p where C has good reduction, we consider the polynomial $L_p(T)$, the numerator
of the zeta function $Z(C/\mathbb{F}_p; T)$. This polynomial is intimately related to many
arithmetic properties of the curve, appearing in the Euler product of the L-series

    $L(C, s) = \prod_p L_p(p^{-s})^{-1}$,

the characteristic polynomial of the Frobenius endomorphism,

    $\chi_p(T) = T^{2g} L_p(T^{-1})$,

and the order of the group of $\mathbb{F}_p$-rational points on the Jacobian of C,

    $\#J(C/\mathbb{F}_p) = L_p(1)$.

In genus 1, C is an elliptic curve, and $L_p(T) = pT^2 - a_pT + 1$ is determined
by the trace of Frobenius, $a_p$. The distribution of $a_p$ as p varies has been and
remains a subject of considerable interest, forming the basis of several well known
conjectures, including those of Lang-Trotter [36] and Sato-Tate [52]. Considerable
progress has been made on these questions, particularly the latter, much of it quite
recently [4, 5, 6, 21, 38].

In genus 2, C is a hyperelliptic curve, and the study of $L_p(T)$ may be viewed
as a natural generalization of these questions (we also consider hyperelliptic curves
of genus 3). Our first objective is to understand the shape of the distribution of
$L_p(T)$, which leads us to focus primarily on Sato-Tate type questions.

The random matrix model developed by Katz and Sarnak provides a ready
generalization of the Sato-Tate conjecture to higher genera. They show (see Theorems
10.1.18.3 and 10.8.2 in [27]) that over a universal family of hyperelliptic curves of
genus g, the distribution of unitarized L-polynomials, $\bar L_p(T) = L_p(p^{-1/2}T)$,
corresponds to the distribution of characteristic polynomials $\chi(T)$ in the compact Lie
group USp(2g) (the group of 2g x 2g complex matrices that are both unitary and
symplectic). By also considering infinite compact subgroups of USp(2g), we are
able to frame a generalization of the Sato-Tate conjecture applicable to any smooth
curve defined over Q.

To test this conjecture, we rely on a collection of highly efficient algorithms
to compute $L_p(T)$, described by the authors in [28]. The performance of these
algorithms has improved dramatically in recent years (due largely to an interest
in cryptographic applications [13]). For a hyperelliptic curve, it is now entirely
practical to compute $L_p(T)$ for all p < N (where the curve has good reduction),
with N on the order of $10^8$ in genus 2 and more than $10^7$ in genus 3. Alternatively,
for much smaller N (less than 10 ) one can perform similar computations with $10^{10}$
curves or more.

We characterize the otherwise overwhelming abundance of data with moment
statistics. If $\{x_p\}$ is a set of unitarized values derived from $L_p(T)$, say
$x_p = a_p/\sqrt{p}$, we compute the first several terms of the sequence

(1)    $m(x_p),\ m(x_p^2),\ m(x_p^3),\ \ldots,$

where $m(x_p^k)$ denotes the mean of $x_p^k$ over p. Under the conjecture, the moment
statistics converge, term by term, to the moment sequence

(2)    $E[X],\ E[X^2],\ E[X^3],\ \ldots,$

where the corresponding random variable X is derived from the characteristic
polynomial $\chi(T)$ of a random matrix A, say X = tr(A). In all the cases of interest to
us, the moments $E[X^n]$ exist and determine the distribution of X. Furthermore,
they are integers. The distributions we encounter can typically be distinguished
by the first eight terms of (2). It will be convenient to begin our sequences with
$E[X^0] = 1$.
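
As a rough illustration of the moment statistics in (1), the following sketch computes $x_p = a_p/\sqrt{p}$ for an elliptic curve $y^2 = x^3 + ax + b$ and averages its powers over primes of good reduction. It is not the authors' algorithm from [28]: it uses naive point counting via Euler's criterion, which costs O(p) per prime and is only practical for small bounds. The function names and the sample curve are assumptions made for illustration.

```python
from math import sqrt

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def trace_of_frobenius(a, b, p):
    """a_p = p + 1 - #E(F_p) for y^2 = x^3 + a*x + b, p an odd prime of good reduction.

    Naive method (illustrative only): a_p = -sum over x of the Legendre
    symbol of x^3 + a*x + b, evaluated with Euler's criterion.
    """
    total = 0
    for x in range(p):
        f = (x * x * x + a * x + b) % p
        if f:
            total += 1 if pow(f, (p - 1) // 2, p) == 1 else -1
    return -total

def moment_statistics(a, b, bound, num_moments=5):
    """Return ([m(x^0), ..., m(x^(num_moments-1))], samples) with x_p = a_p / sqrt(p)."""
    disc = -16 * (4 * a**3 + 27 * b**2)
    xs = [trace_of_frobenius(a, b, p) / sqrt(p)
          for p in primes_up_to(bound) if p > 2 and disc % p != 0]
    moments = [sum(x**k for x in xs) / len(xs) for k in range(num_moments)]
    return moments, xs

if __name__ == "__main__":
    moments, _ = moment_statistics(1, 1, 2000)   # hypothetical sample curve y^2 = x^3 + x + 1
    print([round(m, 3) for m in moments])
```

With a small bound the statistics are only loosely close to the limiting moments; the bounds quoted in the text are many orders of magnitude larger.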

To apply this approach we must determine these moment sequences explicitly. 
This is an interesting problem in its own right, with applications to representation 
theory and combinatorics. The particular cases we consider include the elemen- 
tary and Newton symmetric functions (power sums) of the eigenvalues of a random 
matrix in USp(2g). We derive explicit formulae for the corresponding moment generating
functions. Our results intersect other work in this area, most notably that
of Grabiner and Magyar [18], and also Eric Rains [42]. We take a somewhat differ- 
ent approach, relying on the Haar measure to encode the combinatorial structure 
of the group. 

With moment sequences for USp(2g) in hand, we then survey the distributions
of $\bar L_p(T)$. The case g = 1 is easily described, and we do so here. The polynomial
$\bar L_p(T) = T^2 + a_1T + 1$ is determined by the coefficient $a_1 = -a_p/\sqrt{p}$, and
there are two distributions of $a_1$ that arise.

For elliptic curves without complex multiplication, the moment statistics of $a_1$
converge to the corresponding moment sequence in USp(2):

1, 0, 1, 0, 2, 0, 5, 0, 14, 0, 42, ...,

whose (2n)th term is the nth Catalan number. Convergence follows from the
Sato-Tate conjecture, which for curves with multiplicative reduction at some prime
(almost all curves) is now proven, thanks to the work of Clozel, Harris,
Shepherd-Barron, and Taylor [9, 21, 53] (see Mazur [38] for an overview). Testing at least
one example of every curve with conductor less than 10 (including all the curves
in Cremona's tables [11, 50]) revealed no apparent exceptions among curves with
purely additive reduction.
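
The USp(2) moment sequence above can be checked directly. The sketch below uses the closed form $\binom{n}{n/2} - \binom{n}{n/2+1}$ for the nth moment of $a_1 = -2\cos\theta$ under the Sato-Tate measure (this is the quantity written $B_0 - B_2$ in Section 4) and compares the even terms with the Catalan numbers. The function names are illustrative assumptions.

```python
from math import comb

def sato_tate_moment(n):
    """n-th moment of a_1 = -2*cos(theta) under (2/pi) * sin^2(theta) d(theta) on [0, pi]."""
    if n % 2:
        return 0                      # odd moments vanish by symmetry
    half = n // 2
    return comb(n, half) - comb(n, half + 1)

def catalan(n):
    """n-th Catalan number."""
    return comb(2 * n, n) // (n + 1)

moments = [sato_tate_moment(n) for n in range(11)]
print(moments)
```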

For elliptic curves with complex multiplication, the moment statistics of $a_1$
converge instead to the sequence:

1, 0, 1, 0, 3, 0, 10, 0, 35, 0, 126, ...,

whose (2n)th term is $\binom{2n}{n}/2$ for n > 0. This is the moment sequence of a compact
subgroup of USp(2), specifically, the normalizer of SO(2) in SU(2) = USp(2). The
elements with nonzero traces have uniformly distributed eigenvalue angles, and for
elliptic curves with complex multiplication, convergence follows from a theorem of
Deuring [12] and known equidistribution results for Hecke characters [35, Ch. XV].
The only other infinite compact subgroup of USp(2) (up to conjugacy) is SO(2), and
its moment sequence does not appear to correspond to the moment statistics of any
elliptic curve (such a curve would necessarily contradict the Sato-Tate conjecture).
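
The complex multiplication sequence can be reproduced from the description of the normalizer of SO(2): half of the group has trace zero, and on the other half the trace is $2\cos\theta$ with $\theta$ uniform. The sketch below encodes that reasoning; the quadrature helper is only a numerical cross-check, and all names are illustrative assumptions.

```python
from math import comb, cos, pi

def cm_moment(n):
    """n-th trace moment for the normalizer of SO(2) in USp(2)."""
    if n == 0:
        return 1
    if n % 2:
        return 0
    # The trace-zero component contributes 0; the SO(2) component contributes
    # E[(2*cos(theta))^n] = C(n, n/2), and each component has measure 1/2.
    return comb(n, n // 2) // 2

def uniform_angle_moment(n, points=64):
    """Midpoint-rule estimate of E[(2*cos(theta))^n] for theta uniform on [0, pi]."""
    h = pi / points
    return sum((2 * cos((i + 0.5) * h)) ** n for i in range(points)) * h / pi

print([cm_moment(n) for n in range(11)])
```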

The main purpose of the present work is to undertake a similar study in genus 
2. We also lay some groundwork for genus 3, but consider only the case of a 
typical hyperelliptic curve. Already in genus 2 we find a much richer set of possible 
distributions. 

For the typical hyperelliptic curve in genus 2 (resp. 3), when computed for
suitably large N, the moment statistics closely match the corresponding moment
sequences in USp(4) (resp. USp(6)), as predicted. In genus 2 we can make a
stronger statement. In a family of one million curves with randomly chosen
coefficients, every single one appeared to have the $\bar L_p(T)$ distribution of characteristic
polynomials in USp(4). Under reasonable assumptions, we can reject alternative
distributions with a high level of statistical confidence.

To find exceptional distributions in genus 2 we must cast our net wider,
searching very large families of curves with constrained coefficient values, as well as
examples taken from the literature. Such a search yielded over 30,000 nonisomorphic
exceptional curves, but among these we find only 22 clearly distinct distributions,
each with integer moments (Table 11, p. 29). For each of these distributions we
are able to identify a specific compact subgroup H of USp(4) with a matching
distribution of characteristic polynomials (Table 13, p. 34). The method we use to
construct these subgroups is quite explicit, and a converse statement is very nearly
true. Of the subgroups we can construct, only two do not correspond to a
distribution we have found; we can rule out one of these and suspect the other can be
ruled out also (§ 7.2). We believe we have accounted for all possible L-polynomial
distributions of a genus 2 curve.

2. Some Motivating Examples

We work throughout with smooth, projective, geometrically irreducible
algebraic curves defined over Q. Recall that for a curve C with good reduction at p,
the zeta function $Z(C/\mathbb{F}_p; T)$ is defined by the formal power series

    $Z(C/\mathbb{F}_p; T) = \exp\left(\sum_{k=1}^{\infty} N_k T^k/k\right)$,

where $N_k$ counts the (projective) points on C over $\mathbb{F}_{p^k}$. From the seminal work of
Emil Artin [3], we know that $Z(C/\mathbb{F}_p; T)$ is a rational function of the form

    $Z(C/\mathbb{F}_p; T) = \frac{L_p(T)}{(1-T)(1-pT)}$,

where the polynomial $L_p(T) \in \mathbb{Z}[T]$ has degree 2g (g is the genus of C) and
constant coefficient 1.

By the Riemann hypothesis for curves (proven by Weil [56]), the roots of $L_p(T)$
lie on a circle of radius $p^{-1/2}$ about the origin of the complex plane. To study the
distribution of $L_p(T)$ as p varies, we use the unitarized polynomial

    $\bar L_p(T) = L_p(p^{-1/2}T)$,

which has roots on the unit circle. As $\bar L_p(T)$ is a real polynomial of even degree
with $\bar L_p(0) = 1$, these roots occur in conjugate pairs. We may write

    $\bar L_p(T) = T^{2g} + a_1T^{2g-1} + a_2T^{2g-2} + \cdots + a_2T^2 + a_1T + 1$.

Since $\bar L_p(T)$ has unitary roots, we know that

(3)    $|a_i| \le \binom{2g}{i}$,

and ask how $a_i$ is distributed within this interval as p varies.

The next three pages show the distribution of $a_1$ for arbitrarily chosen curves
of genus 1, 2, and 3 and various values of N. The coefficient $a_1$ is the negative sum
of the roots of $\bar L_p(T)$, and may be written as $a_1 = -a_p/\sqrt{p}$, where $a_p$ is the trace
of Frobenius.

Each graph represents a histogram of nearly $\pi(N)$ samples (one for each prime
where C has good reduction) placed into approximately $\sqrt{\pi(N)}$ buckets which
partition the interval [-2g, 2g] determined by (3). The horizontal axis spans this
interval, and the vertical axis has been suitably scaled, with the height of the
uniform distribution, 1/(4g), indicated by a dotted line.
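
The bucketing scheme just described can be sketched as follows. This is a minimal illustration, assuming the samples are values of $a_1$ in [-2g, 2g]; the function name and the normalization (densities that integrate to 1, so that the uniform distribution has height 1/(4g)) are assumptions, not a description of the plotting code used for the figures.

```python
from math import isqrt

def histogram_density(samples, g):
    """Bucket samples from [-2g, 2g] into about sqrt(len(samples)) equal-width buckets.

    Returns (densities, width); densities are scaled so the total area is 1,
    which puts the uniform distribution at height 1/(4g).
    """
    lo, hi = -2.0 * g, 2.0 * g
    buckets = max(1, isqrt(len(samples)))
    width = (hi - lo) / buckets
    counts = [0] * buckets
    for x in samples:
        i = min(int((x - lo) / width), buckets - 1)   # clamp x == hi into the last bucket
        counts[i] += 1
    scale = 1.0 / (len(samples) * width)
    return [c * scale for c in counts], width
```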
The familiar semicircular shape in genus 1 is the Sato-Tate distribution. The
examples in higher genera also appear to converge to distinct distributions.
Provided the curve is "typical" (a notion we will define momentarily) this distribution
is the same for every curve of a given genus. Even in atypical cases, there is
(empirically) a small distinct set of distributions that arise for a given genus. In genus
1, only one exceptional distribution for $a_1$ is known, exhibited by all curves with
complex multiplication:




Fig. 1: Distribution of $a_1$ for $y^2 = x^3 - 15x + 22$.

The central spike has area 1/2 (asymptotically), arising from the fact that a
curve with CM-field $\mathbb{Q}[\sqrt{D}]$ has $a_p = 0$ precisely when D is not a quadratic
residue in $\mathbb{F}_p$ [12].
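
The vanishing criterion can be observed numerically for the curve of Fig. 1. The sketch below assumes that this curve has CM-field $\mathbb{Q}[\sqrt{-3}]$ (its j-invariant works out to 54000), so that for primes p > 3 one expects $a_p = 0$ exactly when $p \equiv 2 \pmod 3$. Point counting is done naively, and the names and the bound 2000 are illustrative choices.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def trace_of_frobenius(a, b, p):
    """a_p for y^2 = x^3 + a*x + b over F_p (p odd, good reduction), by a naive character sum."""
    total = 0
    for x in range(p):
        f = (x * x * x + a * x + b) % p
        if f:
            total += 1 if pow(f, (p - 1) // 2, p) == 1 else -1
    return -total

A, B = -15, 22                                   # the curve of Fig. 1; bad primes are 2 and 3
good = [p for p in primes_up_to(2000) if p > 3]
zero_trace = [p for p in good if trace_of_frobenius(A, B, p) == 0]
print(len(zero_trace) / len(good))               # proportion of primes in the central spike
```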

In higher genera, a richer set of exceptional distributions arises. Below is an
example for a genus 2 curve whose Jacobian splits as the product of two elliptic
curves (one with complex multiplication). Histograms for several other exceptional
genus 2 distributions are provided in Appendix II.

Fig. 2: Distribution of $a_1$ for $y^2 = x^5 + 20x^4 - 26x^3 + 20x^2 + x$.

3. A Generalized Sato-Tate Conjecture

We wish to give a conjectural basis for these distributions, both in the typical 
and atypical cases. The formulation presented here follows the model developed by 
Katz and Sarnak [27] and relies heavily on additional detail provided by Nicholas 
Katz [26], whom we gratefully acknowledge. Most of the statements below readily 
generalize to abelian varieties, but we restrict ourselves to curves defined over Q. 

We begin with the Sato-Tate conjecture, which may be stated as follows:

Conjecture 1 (Sato-Tate). For an elliptic curve without complex
multiplication, the distribution of the roots $e^{i\theta}$ and $e^{-i\theta}$ of $\bar L_p(T)$ for p < N converges (as
$N \to \infty$) to the distribution given by the measure $\mu = \frac{2}{\pi}\sin^2\theta\,d\theta$ over $\theta \in [0, \pi]$.

As noted earlier, this has been proven for elliptic curves with multiplicative
reduction at some prime [21]. Using $a_1 = -2\cos\theta$, one finds that

    $\Pr[a_1 \le x] = \frac{1}{2\pi}\int_{-2}^{x}\sqrt{4 - t^2}\,dt$,

giving the familiar semicircular distribution.

An equivalent formulation of Conjecture 1 is that the distribution of $\bar L_p(T)$
corresponds to the distribution of the characteristic polynomial of a random
matrix in USp(2). More generally, the Haar measure on USp(2g) provides a natural
distribution of unitary symplectic polynomials of degree 2g: the eigenvalues of a
unitary matrix lie on the unit circle and the symplectic condition ensures that they
occur in conjugate pairs (giving a polynomial with real coefficients). Conversely,
each unitary symplectic polynomial corresponds to a conjugacy class of matrices in
USp(2g).

Let the eigenvalues of a random matrix in USp(2g) be $e^{\pm i\theta_1}, \ldots, e^{\pm i\theta_g}$, with
$\theta_j \in [0, \pi]$. The joint probability density function on $(\theta_1, \ldots, \theta_g)$ given by the Haar
measure on USp(2g) is

(4)    $\mu(USp(2g)) = \frac{1}{g!}\Big(\prod_{j<k}(2\cos\theta_j - 2\cos\theta_k)\Big)^2 \prod_j \Big(\frac{2}{\pi}\sin^2\theta_j\,d\theta_j\Big)$,

as shown by Weyl [57, Thm. 7.8B, p. 218] (also see [27, 5.0.4, p. 107]). For g = 1,
we obtain the Sato-Tate distribution above.
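
The density in (4) is easy to evaluate numerically, which gives a quick sanity check on small moments of the trace. The sketch below integrates powers of $a_1 = -\sum_j 2\cos\theta_j$ against (4) with a tensor-product midpoint rule; because the integrand is an even trigonometric polynomial in each angle, this rule is exact up to rounding once the grid is fine enough. The function names and grid sizes are assumptions chosen for illustration, and the cost grows like points^g.

```python
from itertools import product
from math import cos, sin, pi, factorial

def usp_density(thetas):
    """Joint density (4) of the eigenvalue angles for Haar measure on USp(2g), g = len(thetas)."""
    g = len(thetas)
    x = [2 * cos(t) for t in thetas]
    vandermonde = 1.0
    for j in range(g):
        for k in range(j + 1, g):
            vandermonde *= x[j] - x[k]
    weight = 1.0
    for t in thetas:
        weight *= (2 / pi) * sin(t) ** 2
    return vandermonde ** 2 * weight / factorial(g)

def trace_moment(g, n, points=64):
    """Midpoint-rule estimate of E[a_1^n] on [0, pi]^g, with a_1 = -sum_j 2*cos(theta_j)."""
    h = pi / points
    grid = [(i + 0.5) * h for i in range(points)]
    total = 0.0
    for thetas in product(grid, repeat=g):
        a1 = -sum(2 * cos(t) for t in thetas)
        total += a1 ** n * usp_density(thetas)
    return total * h ** g

if __name__ == "__main__":
    print([round(trace_moment(2, n), 6) for n in range(7)])
```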

In view of the atypical examples in the previous section, we cannot expect
every curve to achieve the distribution given by $\mu(USp(2g))$. One might impose a
restriction comparable to that of Sato-Tate by requiring that $\mathrm{End}_{\mathbb{C}}(J(C)) = \mathbb{Z}$, i.e.
that the Jacobian have minimal endomorphism ring. While necessary, it is not clear
that this restriction is sufficient in general. A stronger condition uses the $\ell$-adic
representation of $\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$ induced by the Galois action on the Tate module $T_\ell(C)$
(the inverse limit of the $\ell^n$-torsion subgroups of J(C)). Specifically, we require that
the image of the representation

    $\rho_\ell : \mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q}) \to \mathrm{Aut}(T_\ell(C)) \cong GL(2g, \mathbb{Z}_\ell)$

be Zariski dense in $GSp(2g, \mathbb{Z}_\ell) \subseteq GL(2g, \mathbb{Z}_\ell)$. We know from results of Serre ([46,
Sec. 7, Thm 3] and [44, p. 104]) that if this is achieved for any $\ell$, then it holds for
all $\ell$, and we say such a curve has large Galois image.

Each of the conditions below suffices for C to have large Galois image.

(1) C is a genus g curve with g odd, 2, or 6, and $\mathrm{End}_{\mathbb{C}}(J(C)) = \mathbb{Z}$ (Serre
[46, 45]). This does not hold in genus 4 (Mumford [40]).

(2) C is a hyperelliptic curve $y^2 = f(x)$ with f(x) of degree $n \ge 5$ and the
Galois group of f(x) is isomorphic to $S_n$ or $A_n$ (Zarhin [58]).

(3) C is a genus g curve with good reduction outside of a set of primes S and
the mod $\ell$ reduction of the image of $\rho_\ell$ is equal to $GSp(2g, \mathbb{Z}/\ell\mathbb{Z})$ for some
$\ell > c(g, S)$ (Faltings [14], Serre [46, 44]).

Condition 3 was suggested to us by Katz. The constant c(g, S) depends only on
g and S. From Faltings [14, Thm. 5], we know that for a given g and S, there are
only finitely many nonisomorphic Jacobians of genus g curves with good reduction
outside of S. Each such curve has large Galois image if and only if for all $\ell > c$
the mod $\ell$ image of $\rho_\ell$ is $GSp(2g, \mathbb{Z}/\ell\mathbb{Z})$, for some constant c. By the results of
Serre cited above, it suffices to find one such $\ell$, and taking the maximum of c over
the finite set of Jacobians gives the desired c(g, S). At present, effective bounds
on c(g, S) are known only in genus 1 [10, 32]. Even without effective bounds,
Condition 3 gives an easily computable heuristic that is useful in practice.²

We now consider the situation for a curve which does not have large Galois
image. The simplest case is an elliptic curve with complex multiplication. For such
a curve the distribution of $a_1$ clearly does not match the distribution of traces in
USp(2) (see Fig. 1 above). In particular, the density of primes for which
$a_1 = -a_p/\sqrt{p} = 0$ is one half. There is, however, a compact subgroup of USp(2) which
gives the correct distribution. Consider the subgroup

    $H = \left\{ \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}, \begin{pmatrix} i\cos\theta & i\sin\theta \\ i\sin\theta & -i\cos\theta \end{pmatrix} : \theta \in [0, 2\pi] \right\}$.

H contains SO(2) as a subgroup of index 2 (it is in fact the normalizer of SO(2) in
USp(2)). The elements of H not in SO(2) all have zero traces, giving the desired
density of 1/2. The Haar measure on SO(2) gives uniformly distributed eigenvalue
angles, matching the distribution of nonzero traces for elliptic curves with complex
multiplication. Note that H is disconnected. This is a common (but not universal)
feature among the subgroups we wish to consider.

For an elliptic curve with complex multiplication, the mod $\ell$ Galois image lies
in the normalizer of a Cartan subgroup (see Lang [34, Thm 3.2]). For sufficiently
large $\ell$, one finds that it is in fact equal to the normalizer of a Cartan subgroup. There
is then a correspondence with the subgroup H above, which is the normalizer of
the Cartan subgroup SO(2) in SU(2) = USp(2).

In general, we anticipate a relationship between the $\ell$-adic Galois image of a
curve C, call it $G_\ell$ (a subgroup of $GSp(2g, \mathbb{Z}_\ell)$), and a compact subgroup H of
USp(2g) whose distribution of characteristic polynomials matches the distribution
of unitarized L-polynomials $\bar L_p(T)$ of C. One can (conjecturally) describe this
relationship, albeit in a nonexplicit manner. Briefly, one takes the Zariski closure
of $G_\ell$ through a series of embeddings:

    $GSp(2g, \mathbb{Z}_\ell) \to GSp(2g, \mathbb{Q}_\ell) \to GSp(2g, \bar{\mathbb{Q}}_\ell) \to GSp(2g, \mathbb{C})$.

The last step is justified by the fact that $\bar{\mathbb{Q}}_\ell$ and $\mathbb{C}$ are algebraically closed fields
containing Q of equal cardinality, hence isomorphic, and we choose a particular
embedding of $\bar{\mathbb{Q}}_\ell$ in $\mathbb{C}$. One then takes the intersection with Sp(2g), obtaining a
reductive group over $\mathbb{C}$. After dividing by $\sqrt{p}$, the image of each Frobenius element
lies in this intersection, as a diagonalizable matrix with unitary eigenvalues. We
now consider a maximal compact subgroup of the intersection, H, lying in USp(2g).
Each unitarized Frobenius element is conjugate to some element of H (unique up
to conjugacy in H) whose characteristic polynomial is $\bar L_p(T)$. When $G_\ell$ is Zariski
dense in $GSp(2g, \mathbb{Z}_\ell)$, we obtain H = USp(2g), but in general, H is some compact
subgroup of USp(2g).

There is an analogous construction involving the Mumford-Tate group MT(A)
of an abelian variety A, which contains $G_\ell$. The result is the Hodge group Hg(A),
corresponding to our subgroup H above. For simple abelian varieties of low
dimension (up to genus 5) the possibilities for Hg(A) have been classified by Moonen and
Zarhin [39].³ For genus 2 curves we consider the classification of H in Section 7,
and give what we believe to be a complete list of the possibilities (most of these do
not correspond to simple Jacobians). We note here that curves with nonisomorphic
Hodge groups may have identical L-polynomial distributions, so the notions are not
equivalent.

² A fourth condition has recently been proven by Hall [20]. In the same paper, Kowalski proves
that almost all hyperelliptic curves have large Galois image.

We can now state the conjecture. 

Conjecture 2 (Generalized Sato-Tate). For a curve C of genus g, the
distribution of $\bar L_p(T)$ converges to the distribution of characteristic polynomials $\chi(T)$
in an infinite compact subgroup $H \subseteq USp(2g)$. Equality holds if and only if C has
large Galois image.

We say that the subgroup H represents the L-polynomial distribution of C.
4. Moment Sequences Attached to L-polynomials 

Let $P_C(N)$ denote the set of primes p < N for which the curve C has good
reduction. We may compute $L_p(T)$ for all $p \in P_C(N)$, and if x is a quantity derived
from $\bar L_p(T)$ (e.g., the coefficients $a_k$ or some function of the $a_k$), we consider the
mean value $m(x^n)$ over $p \in P_C(N)$ as an approximation of the nth moment $E[X^n]$
of a corresponding random variable X. Under Conjecture 2, we assume there is a
compact subgroup $H \subseteq USp(2g)$ which represents the L-polynomial distribution
of C. If X is a real random variable defined as a polynomial of the eigenvalues of
an element of H, it clearly has bounded support under the Haar measure on H
(the eigenvalues lie on the unit circle). Therefore its moments all exist. Further,
one may determine absolute bounds on X depending only on g, which we regard
as fixed. Carleman's condition [31, p. 126] then implies that the moment sequence
for X uniquely determines its distribution. We summarize this argument with the
following proposition.

Proposition 1. Under Conjecture 2, let H be a compact subgroup of USp(2g)
which represents the L-polynomial distribution of a curve C. Let $x_p$ be a real-valued
polynomial of the roots of $\bar L_p(T)$, and let the random variable X be the corresponding
polynomial of the eigenvalues of a random matrix in H. Then the moments of X
exist and determine the distribution of X. For all nonnegative integers n, the mean
value of $x_p^n$ over $p \in P_C(N)$ converges to $E[X^n]$ as $N \to \infty$.

We now consider random variables X that are symmetric functions
(polynomials) of the eigenvalues, with integer coefficients. In this case, it is easy to see that
the moments of X must be integers.

Proposition 2. Let V be a vector space of finite dimension over $\mathbb{C}$. Let the
random variable X be a symmetric polynomial with integer coefficients over the
eigenvalues of a random matrix in a compact group $G \subseteq GL(V)$, distributed with
Haar measure. Then $E[X^n] \in \mathbb{Z}$ for all nonnegative integers n.

Proof. The sum of the eigenvalues, $e_1$, is the trace of the standard
representation V of G, and $E[e_1] = \int_G \mathrm{tr}(A)\,dA$ counts the multiplicity of the trivial
representation in V, an integer. Similarly, the kth symmetric function $e_k$ is the
trace of the kth exterior power $\Lambda^k V$, hence $E[e_k]$ must be an integer. The product
of the traces of two representations is the trace of their tensor product, hence $E[e_k^n]$
is an integer, as are the moments of any product of elementary symmetric
functions. Linearity of expectation implies that $\mathbb{Z}$-linear combinations of products of
elementary symmetric functions also have integer moments, and every symmetric
polynomial with integer coefficients may be expressed in this way. □

³ We thank David Zywina for bringing this to our attention.

Proposition 2 is quite useful in practice, particularly when H is unknown. We
can "determine" $E[X^n]$ for small n by computing the mean of $x^n$ up to a suitable
bound N. The value obtained is purely heuristic of course, depending both on
Conjecture 2 and an assumption about the rate of convergence. Nevertheless, the
consistency of the results so obtained is quite compelling.

For most curves (those with large Galois image) we expect H = USp(2g), and
we may compute the distribution of X using the measure $\mu$ defined in (4).

4.1. Moment Generating Functions in USp(2g). Let $\chi(T) = \sum_k a_k T^k$
be the characteristic polynomial of a random matrix in USp(2g) with eigenvalues
$e^{\pm i\theta_1}, \ldots, e^{\pm i\theta_g}$. Up to a sign $(-1)^k$, the $a_k$ are the elementary symmetric functions
of the eigenvalues. We also define

    $s_k = \sum_j \left(e^{ik\theta_j} + e^{-ik\theta_j}\right) = 2\sum_j \cos k\theta_j$,

the kth power sums of the eigenvalues (Newton symmetric functions).

We wish to compute the (integer) sequence $M[X] = (1, E[X], E[X^2], \ldots)$, where
X is $a_k$ or $s_k$. We first consider $s_k$ (including $a_1 = -s_1$).⁴ We have



(5)    $M[s_k](n) = E[s_k^n] = \int_V \Big(\sum_j 2\cos k\theta_j\Big)^n \mu$,

where $V = [0, \pi]^g$ denotes the volume of integration. If we expand (5) using the
formula for $\mu$ in (4), we need only consider univariate integrals of the form

(6)    $C_k^m(n) = \frac{1}{\pi}\int_0^\pi (2\cos k\theta)^n (2\cos\theta)^m (2\sin^2\theta)\,d\theta$.

We regard $C_k^m$ as a sequence indexed by $n \in \mathbb{Z}^+$ (in fact, an integer sequence),
and define the exponential generating function (egf) of $C_k^m$ by

    $\mathcal{C}_k^m(z) = \sum_{n=0}^{\infty} C_k^m(n)\,\frac{z^n}{n!}$.

Similarly, $\mathcal{M}[s_k]$, the egf of $M[s_k]$, is the moment generating function of $s_k$. We
will compute $\mathcal{C}_k^m$ in terms of sequences $B_\nu$, defined by⁵

(7)    $B_\nu(n) = \binom{n}{(n+\nu)/2}$,

where the binomial coefficient is zero when $(n+\nu)/2$ is not an integer in the
interval [0, n]. With this understanding, we allow $\nu$ to take arbitrary values, with
$B_\nu$ identically zero for $\nu \notin \mathbb{Z}$. Letting $\mathcal{B}_\nu(z) = \sum_{n=0}^{\infty} B_\nu(n)\frac{z^n}{n!}$, we find that

(8)    $\mathcal{B}_\nu(z) = \mathcal{I}_\nu(2z) = \sum_{n=0}^{\infty} \frac{z^{2n+\nu}}{n!\,(n+\nu)!}$    for $\nu \in \mathbb{Z}$.

The function $\mathcal{I}_\nu(z)$ is a hyperbolic Bessel function (of the first kind, of order $\nu$)
[49, Ch. 49-50]. Hyperbolic Bessel functions are defined for nonintegral values of $\nu$
(and are not identically zero), so the condition $\nu \in \mathbb{Z}$ in (8) should be duly noted.

⁴ The other $a_k$ are addressed in Section 4.4 (for $g \le 3$).

⁵ $B_\nu(n)$ counts paths of length n from 0 to $\nu$ on the real line using step set {±1}.

We can now state a concise formula for $\mathcal{M}[s_k]$.

Theorem 1. Let $s_k$ denote the kth power sum of the eigenvalues of a random
element of USp(2g). The moment generating function of $s_k$ is

(9)    $\mathcal{M}[s_k] = \det_{g \times g}\left(\mathcal{C}_k^{i+j-2}\right)$,

where $\mathcal{C}_k^m(z)$ is given by

(10)    $\mathcal{C}_k^m = \sum_{j=0}^{m} \binom{m}{j}\left(\mathcal{B}_{(2j-m)/k} - \mathcal{B}_{(2j-m+2)/k}\right)$,

with $\mathcal{B}_\nu$ defined as above.

We postpone the proof until Section 4.2.

The determinantal formula in Theorem 1 contains some redundancy (e.g., when
g = 3 the term $\mathcal{C}_k^1\mathcal{C}_k^2\mathcal{C}_k^3$ appears twice), but we find the simple form of Theorem 1
well suited to both hand and machine computation. For example, when g = 2 and
k = 1 we have

    $\mathcal{M}[s_1] = \mathcal{C}_1^0\mathcal{C}_1^2 - \mathcal{C}_1^1\mathcal{C}_1^1 = (\mathcal{B}_0 - \mathcal{B}_2)(\mathcal{B}_0 - \mathcal{B}_4) - (\mathcal{B}_1 - \mathcal{B}_3)^2$.

From (15) of Section 4.3, we obtain the identity⁶

(11)    $M[s_1](2n) = c(n)c(n+2) - c(n+1)^2$,

where c(n) is the nth Catalan number. The odd moments are zero and the even
moments form sequence A005700 in the On-line Encyclopedia of Integer Sequences
(OEIS) [47]. A complete list of the sequences $M[s_k]$ for $g \le 3$ can be found in
Section 4.3.

The sequence A005700 = (1, 1, 3, 14, 84, 594, ...) is well known. It counts
lattice paths of length 2n in $\mathbb{Z}^2$ with step set {(±1, 0), (0, ±1)} that return to
the origin and are constrained by $x_1 \ge x_2 \ge 0$. In general, the sequence $M[s_1]$
counts returning lattice paths in $\mathbb{Z}^g$ which remain in the region $x_1 \ge \cdots \ge x_g \ge 0$.
This follows from a general result of Grabiner and Magyar which relates the
decomposition of tensor powers of certain Lie group representations to lattice paths
in a chamber of the associated Weyl group [18]: as in the proof of Proposition 2,
interpret $E[s_1^n]$ as the multiplicity of the trivial representation in $V^{\otimes n}$, where V is
the standard representation of USp(2g), then apply Theorem 2 of [18].
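
The g = 2, k = 1 case can be checked mechanically. The sketch below evaluates the expression $(\mathcal{B}_0 - \mathcal{B}_2)(\mathcal{B}_0 - \mathcal{B}_4) - (\mathcal{B}_1 - \mathcal{B}_3)^2$ on coefficient sequences, using the fact that a product of exponential generating functions corresponds to a binomial convolution of their coefficients, and compares the result with identity (11). The function names are illustrative assumptions.

```python
from math import comb

def B(v, n):
    """B_v(n) = C(n, (n+v)/2) for integer v when (n+v)/2 is an integer in [0, n], else 0."""
    if (n + v) % 2:
        return 0
    r = (n + v) // 2
    return comb(n, r) if 0 <= r <= n else 0

def egf_product(a, b):
    """Coefficients (with respect to z^n/n!) of the product of two egfs: binomial convolution."""
    return [sum(comb(n, i) * a[i] * b[n - i] for i in range(n + 1)) for n in range(len(a))]

def moments_s1_genus2(count):
    """First `count` moments of s_1 in USp(4), from the 2x2 determinant of Theorem 1."""
    c0 = [B(0, n) - B(2, n) for n in range(count)]
    c1 = [B(1, n) - B(3, n) for n in range(count)]
    c2 = [B(0, n) - B(4, n) for n in range(count)]
    p, q = egf_product(c0, c2), egf_product(c1, c1)
    return [x - y for x, y in zip(p, q)]

def catalan(n):
    return comb(2 * n, n) // (n + 1)

if __name__ == "__main__":
    print(moments_s1_genus2(12))
```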

For the group USp(2g) and k = 1, our results intersect those of [18], where
an equivalent determinantal formula is obtained by counting lattice paths in the
Weyl chamber of the corresponding Lie algebra. By contrast, we compute $\mathcal{M}[s_k]$
directly from the measure $\mu(USp(2g))$, using only elementary methods. The Haar
measure, via the Weyl integration formula, effectively encodes the relevant
combinatorial content. Particularly when k > 1, this is simpler than a combinatorial or
representation-theoretic approach. More generally, Haar measure, and moment
sequences in particular, can provide convenient access to the combinatorial structure
of compact groups.

⁶ The similarity of $\mathcal{C}_1^0\mathcal{C}_1^2 - \mathcal{C}_1^1\mathcal{C}_1^1$ and $c(n)c(n+2) - c(n+1)^2$ is somewhat misleading: both
expressions involve Catalan numbers, but the terms do not correspond.

⁷ As explained to us by Arun Ram, this equality may be interpreted in terms of crystal bases,
which leads directly to analogues for groups other than USp(2g).

Determinantal formulas of the type above arise in many combinatorial questions
related to lattice paths and Young tableaux (see [16, 8] for examples). One might
ask what the moment sequences for $s_k$ count when k > 1. For g = 2, some answers
may be found in the OEIS (see links in Table 2).

Before proving Theorem 1, we note the following corollary.

Corollary 1. For all k > 2g, the random variables $s_k$ are identically
distributed with moment generating function $\mathcal{M}[s_k] = (\mathcal{B}_0)^g$.

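Proof. From (9) of Theorem 1, $\mathcal{M}[s_k]$ is a polynomial expression in $\mathcal{C}_k^m$ with
$m \le 2g - 2$. From (10), it follows that $\mathcal{C}_k^m = \left(\binom{m}{m/2} - \binom{m}{m/2-1}\right)\mathcal{B}_0$ for all k > m + 2,
since $\mathcal{B}_\nu$ is zero for nonintegral $\nu$ (the binomial coefficients are taken to be zero when
m is odd). Thus $\mathcal{M}[s_k]$ is an integer multiple of $\mathcal{B}_0^g$, and
$\mathcal{M}[s_k]$ must be equal to $\mathcal{B}_0^g$, since $M[s_k](0) = 1 = B_0(0)$.⁸ □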

The distribution of $s_k$ given by Corollary 1 corresponds to the trace of a random
matrix under a uniform distribution of $\theta_1, \ldots, \theta_g$. This is a special case of a general
phenomenon first noticed by Eric Rains, who has proven similar results for all the
compact classical groups [41, 42]. The sudden transition to a fixed distribution is
quite startling when first encountered; one might naively expect the distribution of
$s_k$ to gradually converge as $k \to \infty$. There is, however, an elementary explanation
(see the proof of Lemma 3 below).
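
The transition described in Corollary 1 can be observed for g = 2 with a few lines of exact arithmetic. The sketch below implements formula (10) with rational indices, forms the 2x2 determinant of Theorem 1 on coefficient sequences, and compares it with the moments of $\mathcal{B}_0^2$. Function names are illustrative assumptions, and only the genus 2 determinant is coded.

```python
from fractions import Fraction
from math import comb

def B(v, n):
    """B_v(n): zero unless v is an integer and (n+v)/2 is an integer in [0, n]."""
    v = Fraction(v)
    if v.denominator != 1:
        return 0
    s = n + int(v)
    if s % 2:
        return 0
    r = s // 2
    return comb(n, r) if 0 <= r <= n else 0

def C(k, m, n):
    """C_k^m(n) from formula (10)."""
    return sum(comb(m, j) * (B(Fraction(2 * j - m, k), n) - B(Fraction(2 * j - m + 2, k), n))
               for j in range(m + 1))

def egf_product(a, b):
    """Binomial convolution: coefficients of the product of two egfs."""
    return [sum(comb(n, i) * a[i] * b[n - i] for i in range(n + 1)) for n in range(len(a))]

def moments_sk_genus2(k, count):
    """Moments of s_k in USp(4): coefficients of C_k^0 * C_k^2 - (C_k^1)^2."""
    c0, c1, c2 = ([C(k, m, n) for n in range(count)] for m in range(3))
    return [x - y for x, y in zip(egf_product(c0, c2), egf_product(c1, c1))]

def uniform_moments_genus2(count):
    """Coefficients of B_0^2, the moment generating function in Corollary 1 for g = 2."""
    b0 = [B(0, n) for n in range(count)]
    return egf_product(b0, b0)

if __name__ == "__main__":
    for k in range(1, 8):
        print(k, moments_sk_genus2(k, 9))
```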

4.2. Proof of Theorem 1. We rearrange the integral for $M[s_k](n) = E[s_k^n]$
to obtain a determinantal expression in $C_k^m$. Lemma 3 then evaluates $C_k^m(n)$.

Proof of Theorem 1. Let $w_j = 2\cos k\theta_j$ for $1 \le j \le g$. Then $s_k = \sum_j w_j$
and the integral for $M[s_k](n)$ in (5) becomes

(12)    $M[s_k](n) = \int_V \Big(\sum_j w_j\Big)^n \mu = \sum_v \binom{n}{v} \int_V \prod_j w_j^{v_j}\,\mu$,

where $V = [0, \pi]^g$ is the volume of integration, $\mu$ is the Haar measure on USp(2g)
defined in (4), v ranges over vectors of g nonnegative integers, and $\binom{n}{v} = \binom{n}{v_1, \ldots, v_g}$.

Now let $x_j = 2\cos\theta_j$ and $y_j = 2\sin^2\theta_j$ so that (4) becomes

    $\mu = \frac{1}{g!} \prod_{i<j}(x_i - x_j)^2 \prod_j \frac{1}{\pi}\,y_j\,d\theta_j$.

By Lemma 1, we may write this as

(13)    $\mu = \frac{1}{g!} \sum_\sigma \det_{g \times g}\left(x_{\sigma(j)}^{i+j-2}\right) \prod_j \frac{1}{\pi}\,y_j\,d\theta_j$,

where $\sigma$ ranges over permutations of {1, ..., g}. From the definition of $C_k^m(n)$ in
(6) we have $C_k^m(n) = \frac{1}{\pi}\int_0^\pi w_j^n x_j^m y_j\,d\theta_j$, for any j. Combining (12) and (13),

    $M[s_k](n) = \sum_v \binom{n}{v} \frac{1}{g!} \sum_\sigma \det_{g \times g}\left(C_k^{i+j-2}(v_{\sigma(j)})\right) = \sum_v \binom{n}{v} \det_{g \times g}\left(C_k^{i+j-2}(v_j)\right)$.

In terms of egfs (by Lemma 2), we then have the desired expression

    $\mathcal{M}[s_k] = \det_{g \times g}\left(\mathcal{C}_k^{i+j-2}\right)$.

Applying Lemma 3 to evaluate the integral $C_k^m(n)$ completes the proof. □

⁸ In fact, for k > 2g, factoring out $\mathcal{B}_0$ from $\mathcal{M}[s_k]$ leaves the Hankel determinant of the sequence
c(0), 0, c(1), 0, c(2), ..., which is 1 (see [1]).



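Lemmas 1 and 2 are elementary facts; lacking a ready reference, we provide
short proofs.

Lemma 1. If $r_1, \ldots, r_g$ are indeterminates in a commutative ring, then

    $\prod_{j<k}(r_j - r_k)^2 = \sum_\sigma \det_{g \times g}\left(r_{\sigma(j)}^{i+j-2}\right)$,

where $\sigma$ ranges over permutations of {1, ..., g}.

Proof. Recall that $\prod_{j<k}(r_j - r_k)$ is the Vandermonde determinant [29, p. 71]

    $\det_{g \times g}\left(r_j^{i-1}\right)$.

Define $r^\alpha = \prod_j r_j^{\alpha(j)}$ for any function $\alpha : \{1, \ldots, g\} \to \mathbb{Z}^+$, and for a permutation
$\sigma$ let $r_\sigma = (r_{\sigma(1)}, \ldots, r_{\sigma(g)})$. Then $\prod_{j<k}(r_j - r_k)^2$ is given by

    $\left(\det\left(r_j^{i-1}\right)\right)^2 = \Big(\sum_\pi \mathrm{sgn}(\pi)\,r^{\pi-1}\Big)^2 = \sum_{\pi,\phi} \mathrm{sgn}(\pi)\,\mathrm{sgn}(\phi)\,r^{\pi+\phi-2}$
    $= \sum_{\pi,\phi} \mathrm{sgn}(\pi\phi)\,r^{\pi+\phi-2} = \sum_{\sigma,\psi} \mathrm{sgn}(\psi)\,r_\sigma^{\mathrm{id}+\psi-2} = \sum_\sigma \det_{g \times g}\left(r_{\sigma(j)}^{i+j-2}\right)$,

where we have substituted $\sigma = \pi^{-1}$ and $\psi = \phi\pi^{-1}$, and the lemma is proven. □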
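Lemma 2. For $1 \le i, j \le m$, let $\mathcal{A}_{i,j} \in \mathbb{C}[[z]]$. Then

    $[n]\det(\mathcal{A}_{i,j}) = \sum_{n_1, \ldots, n_m} \binom{n}{n_1, \ldots, n_m} \det\left([n_j]\mathcal{A}_{i,j}\right)$,

where $[n]\mathcal{F}$ denotes the coefficient of $z^n/n!$ in $\mathcal{F} \in \mathbb{C}[[z]]$.

Proof. For any $\mathcal{F}_1, \ldots, \mathcal{F}_m \in \mathbb{C}[[z]]$ we have

    $[n]\prod_j \mathcal{F}_j = \sum_{n_1, \ldots, n_m} \binom{n}{n_1, \ldots, n_m} \prod_j [n_j]\mathcal{F}_j$.

It follows that

    $[n]\det(\mathcal{A}_{i,j}) = \sum_\sigma \mathrm{sgn}(\sigma)\,[n]\prod_j \mathcal{A}_{\sigma(j),j}$
    $= \sum_\sigma \mathrm{sgn}(\sigma) \sum_{n_1, \ldots, n_m} \binom{n}{n_1, \ldots, n_m} \prod_j [n_j]\mathcal{A}_{\sigma(j),j}$
    $= \sum_{n_1, \ldots, n_m} \binom{n}{n_1, \ldots, n_m} \det\left([n_j]\mathcal{A}_{i,j}\right)$,

as desired. □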



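Lemma 3 computes the integral $C_k^m(n)$.

Lemma 3. Define $C_k^m(n) = \frac{1}{\pi}\int_0^\pi (2\cos k\theta)^n (2\cos\theta)^m (2\sin^2\theta)\, d\theta$, for positive integers $k$ and nonnegative integers $m$ and $n$, and let $B_\nu(n) = \binom{n}{(n+\nu)/2}$, taken to be zero unless $(n+\nu)/2$ is an integer. Then

$$C_k^m(n) = \sum_j \binom{m}{j}\left(B_{(2j-m)/k}(n) - B_{(2j-m+2)/k}(n)\right)$$

for all nonnegative integers $n$.

Proof. We may write $C_k^m(n)$ as

$$\frac{1}{\pi}\int_0^\pi \left(e^{ik\theta} + e^{-ik\theta}\right)^n \left(e^{i\theta} + e^{-i\theta}\right)^m \left(1 - \frac{e^{2i\theta} + e^{-2i\theta}}{2}\right) d\theta.$$

In terms of $\delta(t) = \frac{1}{\pi}\int_0^\pi e^{it\theta}\, d\theta$, one finds that $C_k^m(n)$ is equal to

$$\sum_r \sum_j \binom{n}{r}\binom{m}{j}\, \delta\big(k(2r-n) + 2j - m\big) - \frac{1}{2}\sum_r \sum_j \binom{n}{r}\binom{m}{j}\, \delta\big(k(2r-n) + 2j - m + 2\big) - \frac{1}{2}\sum_r \sum_j \binom{n}{r}\binom{m}{j}\, \delta\big(k(2r-n) + 2j - m - 2\big).$$

As $C_k^m(n)$ is a real number, we need only consider the real parts of the sums above. For integer $t$, the real part of $\delta(t)$ is nonzero only when $t = 0$, in which case it is 1. Hence, in Iverson notation, $\Re(\delta(t)) = [t = 0]$ holds for all $t \in \mathbb{Z}$.^9

The real parts of the second two sums are equal, since we may replace $j$ with $m - j$ and $r$ with $n - r$ in the last sum and then apply $[t = 0] = [-t = 0]$. Interchanging the order of summation we obtain

$$\sum_j \binom{m}{j} \sum_r \binom{n}{r} \Big( [k(2r-n) + 2j - m = 0] - [k(2r-n) + 2j - m + 2 = 0] \Big).$$

We now note that

$$\sum_r \binom{n}{r}[2r - n = \nu] = \binom{n}{(n+\nu)/2} = B_\nu(n),$$

for all nonnegative integers $n$ and arbitrary $\nu$. Thus we have

$$C_k^m(n) = \sum_j \binom{m}{j}\left(B_{(m-2j)/k}(n) - B_{(m-2j-2)/k}(n)\right).$$

Applying the identity $B_\nu = B_{-\nu}$ completes the proof. □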
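
The closed form of Lemma 3 can be compared against direct numerical integration. The sketch below is illustrative only: the function names are made up for this example, and the integral is approximated with a midpoint rule, which happens to be exact (up to rounding) for cosine polynomials of degree below twice the number of nodes. The convention that $B_\nu(n)$ vanishes unless $(n+\nu)/2$ is an integer between 0 and $n$ is an assumption made explicit in the code.

```python
from fractions import Fraction
from math import comb, cos, pi, sin


def B(nu, n):
    """B_nu(n) = binom(n, (n + nu)/2), or 0 if (n + nu)/2 is not an integer in [0, n]."""
    nu = Fraction(nu)
    if nu.denominator != 1:
        return 0
    v = nu.numerator
    if (n + v) % 2 != 0 or abs(v) > n:
        return 0
    return comb(n, (n + v) // 2)


def C_formula(k, m, n):
    """Closed form of Lemma 3."""
    return sum(
        comb(m, j) * (B(Fraction(2 * j - m, k), n) - B(Fraction(2 * j - m + 2, k), n))
        for j in range(m + 1)
    )


def C_integral(k, m, n, steps=4096):
    """(1/pi) * integral over [0, pi] of (2cos k t)^n (2cos t)^m (2 sin^2 t), midpoint rule."""
    h = pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += (2 * cos(k * t)) ** n * (2 * cos(t)) ** m * 2 * sin(t) ** 2
    return total * h / pi


if __name__ == "__main__":
    for k, m, n in [(1, 0, 4), (2, 2, 3), (3, 4, 2), (4, 6, 1)]:
        print(k, m, n, C_formula(k, m, n), round(C_integral(k, m, n), 6))
```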

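4.3. Explicit Computation of $M[s_k]$ in USp(2g), for $g \le 3$. For $g \le 3$, Theorem 1 gives:

g = 1: $\mathcal{M}[s_k] = \mathcal{C}_k^0$;

g = 2: $\mathcal{M}[s_k] = \mathcal{C}_k^0\mathcal{C}_k^2 - \mathcal{C}_k^1\mathcal{C}_k^1$;

g = 3: $\mathcal{M}[s_k] = \mathcal{C}_k^0\mathcal{C}_k^2\mathcal{C}_k^4 + 2\mathcal{C}_k^1\mathcal{C}_k^2\mathcal{C}_k^3 - \mathcal{C}_k^0\mathcal{C}_k^3\mathcal{C}_k^3 - \mathcal{C}_k^1\mathcal{C}_k^1\mathcal{C}_k^4 - \mathcal{C}_k^2\mathcal{C}_k^2\mathcal{C}_k^2$.

We will compute these moment sequences explicitly.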
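
As an illustration of how such a determinant of egfs turns into a moment sequence, the following sketch evaluates the genus 3 expression for $k = 1$. It is a sketch under stated assumptions rather than the authors' implementation: it uses the identity $C_1^m(n) = C(n+m)$ with $C(n) = B_0(n) - B_2(n)$, multiplies egfs by binomial convolution, and the helper names are hypothetical.

```python
from math import comb


def B(nu, n):
    """B_nu(n) = binom(n, (n + nu)/2), or 0 when (n + nu)/2 is not an integer in [0, n]."""
    if (n + nu) % 2 != 0 or abs(nu) > n:
        return 0
    return comb(n, (n + nu) // 2)


def C(n):
    """C(n) = B_0(n) - B_2(n): Catalan numbers interleaved with zeros."""
    return B(0, n) - B(2, n)


def egf_product(a, b):
    """Coefficients of z^n/n! in the product of two egfs (binomial convolution)."""
    size = min(len(a), len(b))
    return [sum(comb(n, k) * a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]


def moments_s1_genus3(n_terms):
    """M[s_1](n) for USp(6), from the 3x3 Hankel determinant of the egfs C_1^0, ..., C_1^4."""
    # c[m][n] = C_1^m(n) = C(n + m) (assumed identity, see lead-in).
    c = [[C(n + m) for n in range(n_terms)] for m in range(5)]

    def triple(i, j, k):
        return egf_product(egf_product(c[i], c[j]), c[k])

    terms = [(1, (0, 2, 4)), (2, (1, 2, 3)), (-1, (0, 3, 3)), (-1, (1, 1, 4)), (-1, (2, 2, 2))]
    expanded = [(coef, triple(*idx)) for coef, idx in terms]
    return [sum(coef * seq[n] for coef, seq in expanded) for n in range(n_terms)]


if __name__ == "__main__":
    print(moments_s1_genus3(9))
```

The output can be compared with the genus 3 sequence for $s_1$ (OEIS A138540) listed in the tables of this section.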



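9. The function $[P]$ is 1 when the boolean predicate $P$ is true and 0 otherwise, see [30].

k\m   0               1                 2                  3                  4                      5                   6
1     $\mathcal{C}$   $D\mathcal{C}$    $D^2\mathcal{C}$   $D^3\mathcal{C}$   $D^4\mathcal{C}$       $D^5\mathcal{C}$    $D^6\mathcal{C}$
2     $\mathcal{A}$   0                 $\mathcal{C}$      0                  $(D+2)\mathcal{C}$     0                   $(D+2)^2\mathcal{C}$
3     $\mathcal{B}$   $-\mathcal{D}$    $\mathcal{B}$      $-\mathcal{D}$     $\mathcal{B}+\mathcal{C}$   $-\mathcal{D}$   $\mathcal{B}+4\mathcal{C}$
4     $\mathcal{B}$   0                 $\mathcal{A}$      0                  $2\mathcal{A}$         0                   $4\mathcal{A}+\mathcal{C}$
5     $\mathcal{B}$   0                 $\mathcal{B}$      $-\mathcal{D}$     $2\mathcal{B}$         $-3\mathcal{D}$     $5\mathcal{B}$
6     $\mathcal{B}$   0                 $\mathcal{B}$      0                  $\mathcal{A}+\mathcal{B}$   0              $4\mathcal{A}+\mathcal{B}$
7     $\mathcal{B}$   0                 $\mathcal{B}$      0                  $2\mathcal{B}$         $-\mathcal{D}$      $5\mathcal{B}$
8     $\mathcal{B}$   0                 $\mathcal{B}$      0                  $2\mathcal{B}$         0                   $5\mathcal{B}-\mathcal{D}$
9     $\mathcal{B}$   0                 $\mathcal{B}$      0                  $2\mathcal{B}$         0                   $5\mathcal{B}$

Table 1. Exponential Generating Functions $\mathcal{C}_k^m$.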



For convenience, define

A = B_0 - B_1    A126930 = (1, -1, 2, -3, 6, -10, 20, -35, 70, -126, . . . ),

B = B_0          A126869 = (1, 0, 2, 0, 6, 0, 20, 0, 70, 0, 252, . . . ),

C = B_0 - B_2    A126120 = (1, 0, 1, 0, 2, 0, 5, 0, 14, 0, 42, . . . ),

D = B_1          A138364 = (0, 1, 0, 3, 0, 10, 0, 35, 0, 126, 0, . . . ),

and let $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, and $\mathcal{D}$ denote the corresponding egfs.
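
The four sequences are straightforward to generate from the binomial definition of $B_\nu(n)$. The snippet below is a small sketch; the convention that $B_\nu(n) = \binom{n}{(n+\nu)/2}$ vanishes when $(n+\nu)/2$ is not an integer in $[0, n]$ is an assumption spelled out in the code, and the variable names are illustrative.

```python
from math import comb


def B(nu, n):
    """B_nu(n) = binom(n, (n + nu)/2), or 0 when (n + nu)/2 is not an integer in [0, n]."""
    if (n + nu) % 2 != 0 or abs(nu) > n:
        return 0
    return comb(n, (n + nu) // 2)


N_TERMS = 11
seq_A = [B(0, n) - B(1, n) for n in range(N_TERMS)]  # A = B_0 - B_1
seq_B = [B(0, n) for n in range(N_TERMS)]            # B = B_0
seq_C = [B(0, n) - B(2, n) for n in range(N_TERMS)]  # C = B_0 - B_2
seq_D = [B(1, n) for n in range(N_TERMS)]            # D = B_1

if __name__ == "__main__":
    for name, seq in (("A", seq_A), ("B", seq_B), ("C", seq_C), ("D", seq_D)):
        print(name, seq)
```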

Lemma 4. Let $D$ denote the derivative operator.

1. $\mathcal{C}_1^m = D^m\mathcal{C}$.

2. $\mathcal{C}_2^m = (D + 2)^{(m-2)/2}\,\mathcal{C}$, for even $m > 0$.

3. $\mathcal{C}_k^m = 0$, for $m$ odd and $k$ even.

4. $\mathcal{C}_k^m = C(m)\,\mathcal{B}$, for $k = m + 1 > 1$, or $k > m + 2$.

PROOF. Recall from (10) of Theorem 1 that

$$\mathcal{C}_k^m = \sum_j \binom{m}{j}\left(\mathcal{B}_{(2j-m)/k} - \mathcal{B}_{(2j-m+2)/k}\right).$$

By Pascal's identity, we have

(14) $$D\mathcal{B}_\nu = \mathcal{B}_{\nu+1} + \mathcal{B}_{\nu-1}.$$

The proofs of statements 1 and 2 are then straightforward inductions on $m$. Statements 3 and 4 follow immediately from Lemma 3. □

Applying Lemmas 3 and 4, we compute Table 1. From Table 1 and (9) of Theorem 1 we obtain a closed form for $\mathcal{M}[s_k]$ in terms of $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, and $\mathcal{D}$. To determine $M[s_k](n)$ we must compute linear combinations of multinomial convolutions of various $B_j$. This is reasonably efficient for small values of $g$ and $n$; however, we can speed up the process significantly with the following lemma.

Lemma 5. Let $\mathcal{B}_\nu(z) = \sum_{n=0}^\infty B_\nu(n) z^n/n!$, where $B_\nu(n) = \binom{n}{(n+\nu)/2}$. Then

$$\mathcal{B}_a(z)\mathcal{B}_b(z) = \sum_{n=0}^\infty B_{a+b}(n)\,B_{a-b}(n)\,\frac{z^n}{n!}$$

for all $a, b \in \mathbb{Z}$.

Proof. The coefficients of $z^n/n!$ on both sides of the equality count lattice paths of length $n$ from (0,0) to $(a, b)$ in $\mathbb{Z}^2$ with step set $\{(\pm 1, 0), (0, \pm 1)\}$. This is immediate for the LHS. A simple bijective proof for the RHS appears in [19]. □
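
Lemma 5 can be checked directly for small parameters by comparing the binomial convolution of two sequences $B_a$, $B_b$ with the product $B_{a+b}(n)B_{a-b}(n)$. This is only a sanity check on a finite range, with hypothetical function names, and it assumes $B_\nu(n)$ is zero whenever $(n+\nu)/2$ is not an integer in $[0, n]$.

```python
from math import comb


def B(nu, n):
    """B_nu(n) = binom(n, (n + nu)/2), or 0 when (n + nu)/2 is not an integer in [0, n]."""
    if (n + nu) % 2 != 0 or abs(nu) > n:
        return 0
    return comb(n, (n + nu) // 2)


def lhs(a, b, n):
    """Coefficient of z^n/n! in B_a(z) * B_b(z), via binomial convolution (O(n) terms)."""
    return sum(comb(n, k) * B(a, k) * B(b, n - k) for k in range(n + 1))


def rhs(a, b, n):
    """Closed form from Lemma 5 (a single multiplication)."""
    return B(a + b, n) * B(a - b, n)


def check(max_abs=4, max_n=12):
    """True if both sides agree for all |a|, |b| <= max_abs and n <= max_n."""
    return all(
        lhs(a, b, n) == rhs(a, b, n)
        for a in range(-max_abs, max_abs + 1)
        for b in range(-max_abs, max_abs + 1)
        for n in range(max_n + 1)
    )


if __name__ == "__main__":
    print(check())
```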

In genus 2, Lemma 5 gives us a closed form for $M[s_k](n)$ in terms of binomial coefficients (or Catalan numbers). For example, from

$$\mathcal{M}[s_1] = \mathcal{C}_1^0\mathcal{C}_1^2 - \mathcal{C}_1^1\mathcal{C}_1^1 = (\mathcal{B}_0 - \mathcal{B}_2)(\mathcal{B}_0 - \mathcal{B}_4) - (\mathcal{B}_1 - \mathcal{B}_3)^2,$$

we obtain

$$M[s_1] = B_0^2 - B_2^2 - B_4^2 + B_2B_6 - B_0B_2 + 2B_2B_4 - B_0B_6 = (B_0 - B_2)\big(2(B_0 - B_4) + B_2 - B_6\big) - (B_0 - B_4)^2.$$

This may be expressed more compactly as

(15) $$M[s_1](n) = C(n)\,C(n + 4) - C(n + 2)^2.$$

Here $C(2n) = c(n)$ is the $n$th Catalan number, giving the identity (11) noted earlier. Similar formulas for the other $M[s_k]$ in genus 2 are listed in Table 2. In higher genera we do not obtain a closed form, but computation is considerably faster with Lemma 5; in genus 4 we use $O(n)$ multiplications to compute $M[s_k](n)$, rather than $O(n^3)$.
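
The genus 2 closed form can be compared with a direct evaluation of the egf expression $\mathcal{C}_1^0\mathcal{C}_1^2 - \mathcal{C}_1^1\mathcal{C}_1^1$. The sketch below is illustrative: it assumes $C_1^m(n) = C(n+m)$ and the closed form $C(n)C(n+4) - C(n+2)^2$ (the version listed for $g = 2$, $k = 1$ in Table 2), and the function names are not from the paper.

```python
from math import comb


def C(n):
    """C(n) = B_0(n) - B_2(n): the Catalan number c(n/2) for even n, and 0 for odd n."""
    if n % 2:
        return 0
    h = n // 2
    return comb(n, h) - (comb(n, h + 1) if h + 1 <= n else 0)


def closed_form(n):
    """Closed form for M[s_1](n) in USp(4)."""
    return C(n) * C(n + 4) - C(n + 2) ** 2


def egf_product(a, b):
    """Binomial convolution, i.e. the product of two exponential generating functions."""
    size = min(len(a), len(b))
    return [sum(comb(n, k) * a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]


def via_egfs(n_terms):
    """M[s_1](n) from C_1^0 * C_1^2 - C_1^1 * C_1^1, using C_1^m(n) = C(n + m)."""
    c0 = [C(n) for n in range(n_terms)]
    c1 = [C(n + 1) for n in range(n_terms)]
    c2 = [C(n + 2) for n in range(n_terms)]
    return [x - y for x, y in zip(egf_product(c0, c2), egf_product(c1, c1))]


if __name__ == "__main__":
    print([closed_form(n) for n in range(15)])
    print(via_egfs(15))
```

The closed form needs a constant number of multiplications per term, whereas the convolution route needs a number of terms that grows with $n$, which is the speedup the text attributes to Lemma 5.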

By Corollary 1, the sequences for $k > 2g$ are all the same, so it suffices to consider $k \le 2g+1$. For even $k \le 2g$ we find that $E[s_k] = -1$, hence we also consider $M[s_k + 1] = M[s_k^+]$, the sequence of central moments. These may be obtained by computing the binomial convolution of $M[s_k]$ with the sequence (1, 1, 1, . . . ). In genus 1 we obtain

$M[s_2^+]$ = (1, 0, 1, 1, 3, 6, 15, 36, 91, 232, 603, 1585, . . . ) A005043,

and in genus 2 we find

$M[s_2^+]$ = (1, 0, 2, 1, 11, 16, 95, 232, 1085, 3460, 14820, . . . ) A138351,

$M[s_4^+]$ = (1, 0, 3, 1, 21, 26, 215, 493, 2821, 9040, 43695, . . . ) A138354.
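
The passage from $M[s_k]$ to the central moments $M[s_k^+]$ is a single binomial convolution with the all-ones sequence. A minimal sketch follows; the genus 2 formulas for $s_2$ and $s_4$ are the ones listed in Table 2, the helper names are illustrative, and $B_\nu(n)$ is assumed to vanish when $(n+\nu)/2$ is not an integer in $[0, n]$.

```python
from math import comb


def B(nu, n):
    """B_nu(n) = binom(n, (n + nu)/2), or 0 when (n + nu)/2 is not an integer in [0, n]."""
    if (n + nu) % 2 != 0 or abs(nu) > n:
        return 0
    return comb(n, (n + nu) // 2)


def A(n):
    return B(0, n) - B(1, n)


def C(n):
    return B(0, n) - B(2, n)


def D(n):
    return B(1, n)


def plus_one(moments):
    """Moments of X + 1 from the moments of X (binomial convolution with 1, 1, 1, ...)."""
    return [sum(comb(n, k) * moments[k] for k in range(n + 1)) for n in range(len(moments))]


N_TERMS = 10
genus1_s2_plus = plus_one([A(n) for n in range(N_TERMS)])
genus2_s2_plus = plus_one([C(n) * D(n + 1) - D(n) * C(n + 1) for n in range(N_TERMS)])
genus2_s4_plus = plus_one([B(0, n) ** 2 - D(n) ** 2 for n in range(N_TERMS)])

if __name__ == "__main__":
    print(genus1_s2_plus)
    print(genus2_s2_plus)
    print(genus2_s4_plus)
```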

4.4. Explicit Computation of $M[a_k]$ in USp(2g) for $g \le 3$. To complete our study of moment sequences, we now consider the coefficients $a_k$ of the characteristic polynomial $\chi(T)$ of a random matrix in USp(2g). We have already addressed $a_1 = -s_1$. For $k > 1$, the Newton identities allow us to express $a_k$ in terms of $s_1, \ldots, s_k$; however, this does not allow us to easily compute $M[a_k]$ from the sequences $M[s_j]$ (the covariance among the $s_j$ is nonzero). Instead, we note that by writing

(16) $$\chi(T) = \prod_j \left(T - e^{i\theta_j}\right)\left(T - e^{-i\theta_j}\right) = \prod_j \left(T^2 - 2\cos\theta_j\, T + 1\right),$$





g   k   M[s_k](n) = E[s_k^n]            first terms                                                        OEIS
1   1   C(n)                            1, 0, 1, 0, 2, 0, 5, 0, 14, 0, 42, 0, 132, 0, 429, . . .           A126120
    2   A(n)                            1, -1, 2, -3, 6, -10, 20, -35, 70, -126, 252, -462, 924, . . .     A126930
    3   B(n)                            1, 0, 2, 0, 6, 0, 20, 0, 70, 0, 252, 0, 924, 0, 3432, . . .        A126869
2   1   C(n)C(n+4) - C(n+2)^2           1, 0, 1, 0, 3, 0, 14, 0, 84, 0, 594, 0, 4719, 0, 40898, . . .      A138349
    2   C(n)D(n+1) - D(n)C(n+1)         1, -1, 3, -6, 20, -50, 175, -490, 1764, -5292, 19404, . . .        A138350
    3   B(n)C(n)                        1, 0, 2, 0, 12, 0, 100, 0, 980, 0, 10584, 0, 121968, . . .         A000888*
    4   B(n)^2 - D(n)^2                 1, -1, 4, -9, 36, -100, 400, -1225, 4900, -15876, 63504, . . .     A018224*
    5   B(n)^2                          1, 0, 4, 0, 36, 0, 400, 0, 4900, 0, 63504, 0, 853776, . . .        A002894*

Table 2. Moment Sequences $M[s_k]$ for $g \le 2$.

X       M[X](n) = E[X^n]                                                       OEIS
s_1     1, 0, 1, 0, 3, 0, 15, 0, 104, 0, 909, 0, 9449, 0, 112398, . . .        A138540
s_2     1, -1, 3, -7, 24, -75, 285, -1036, 4242, -16926, 73206, . . .          A138541
s_2^+   1, 0, 2, 0, 11, 1, 95, 36, 1099, 982, 15792, 25070, . . .              A138542
s_3     1, 0, 3, 0, 26, 0, 345, 0, 5754, 0, 110586, 0, 2341548, . . .          A138543
s_4     1, -1, 4, -9, 42, -130, 660, -2415, 12810, -51786, 281736, . . .       A138544
s_4^+   1, 0, 3, 1, 27, 26, 385, 708, 7231, 20296, 164277, . . .               A138545
s_5     1, 0, 4, 0, 42, 0, 660, 0, 12810, 0, 281736, 0, 6727644, . . .         A138546
s_6     1, -1, 6, -15, 90, -310, 1860, -7455, 44730, -195426, . . .            A138547
s_6^+   1, 0, 5, 1, 63, 46, 1135, 1800, 25431, 66232, 666387, . . .            A138548
s_7     1, 0, 6, 0, 90, 0, 1860, 0, 44730, 0, 1172556, . . .                   A002896

Table 3. Moment Sequences $M[s_k]$ and $M[s_k^+]$ for $g = 3$.

* The OEIS sequence differs slightly.
The notation $X^+$ denotes the random variable $X + 1$.

each $a_k$ may be expressed as a polynomial in $2\cos\theta_1, \ldots, 2\cos\theta_g$. It follows that we may compute $M[a_k]$ in terms of the sequences $C_1^m$ already considered. The following proposition addresses the cases that arise for $g \le 3$.

Proposition 3. Let $C(n) = B_0(n) - B_2(n)$ as above and let $a_k$ be the coefficient of $T^k$ in $\chi(T)$, the characteristic polynomial of a random matrix in USp(2g). For $g = 2$ we have

(17) $$E[(a_2 - 2)^n] = C(n)\,C(n + 2) - C(n + 1)^2,$$

and if $g = 3$ then

(18) $$E[(a_2 - 3)^n] = \sum_{n_1,n_2,n_3} \binom{n}{n_1, n_2, n_3} \det_{3\times 3} C(n - n_i + i + j - 2).$$

Also for $g = 3$ we have

(19) $$E[a_3^n] = \sum_{n_1,n_2,n_3,m} \binom{n}{n_1, n_2, n_3, m}\, 2^{n-m} \det_{3\times 3} C(m + n_i + i + j - 2).$$

Proof. In genus 2, equation (16) gives $a_2 - 2 = (2\cos\theta_1)(2\cos\theta_2)$ and we have

(20) $$E[(a_2 - 2)^n] = \int_0^\pi\!\!\int_0^\pi (2\cos\theta_1)^n (2\cos\theta_2)^n\, \mu.$$

By Lemma 3, we note that

(21) $$C(n + m) = C_1^m(n) = \frac{1}{\pi}\int_0^\pi (2\cos\theta)^n (2\cos\theta)^m (2\sin^2\theta)\, d\theta.$$

Expanding (20) and applying (21) yields (17). In genus 3 we write

$$a_2 - 3 = (2\cos\theta_1)(2\cos\theta_2) + (2\cos\theta_1)(2\cos\theta_3) + (2\cos\theta_2)(2\cos\theta_3),$$

and apply Lemma 1 to write the expanded integral for $E[(a_2 - 3)^n]$ in determinantal form to obtain (18) (we omit the details). For (19), note that

$$a_3 = (2\cos\theta_1)(2\cos\theta_2)(2\cos\theta_3) + 2(2\cos\theta_1 + 2\cos\theta_2 + 2\cos\theta_3),$$

and proceed similarly. □

Taking the binomial convolution of $M[a_2 - g]$ with the sequence $(1, g, g^2, \ldots)$ gives the moment sequence for $a_2$ in genus $g$. One finds that $E[a_2] = 1$, hence we also consider the sequence of central moments, $M[a_2 - 1]$. Table 4 gives the complete set of moment sequences for $a_k$ in genus $g \le 3$, including $a_1 = -s_1$.
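
For genus 2 the whole computation fits in a few lines. The sketch below assumes the closed form $E[(a_2-2)^n] = C(n)C(n+2) - C(n+1)^2$ and recovers the moments of $a_2$ and of $a_2 - 1$ by binomial convolution with powers of the shift; the function names are illustrative and not taken from the paper.

```python
from math import comb


def C(n):
    """C(n) = B_0(n) - B_2(n): the Catalan number c(n/2) for even n, and 0 for odd n."""
    if n % 2:
        return 0
    h = n // 2
    return comb(n, h) - (comb(n, h + 1) if h + 1 <= n else 0)


def shifted(moments, c):
    """Moments of X + c from the moments of X: binomial convolution with (1, c, c^2, ...)."""
    return [
        sum(comb(n, k) * c ** (n - k) * moments[k] for k in range(n + 1))
        for n in range(len(moments))
    ]


N_TERMS = 9
base = [C(n) * C(n + 2) - C(n + 1) ** 2 for n in range(N_TERMS)]  # moments of a_2 - 2
a2_moments = shifted(base, 2)          # moments of a_2
a2_central = shifted(base, 1)          # moments of a_2 - 1 (central, since E[a_2] = 1)

if __name__ == "__main__":
    print(base)
    print(a2_moments)
    print(a2_central)
```

The two derived lists can be compared with the genus 2 rows for $a_2$ and $a_2 - 1$ in Table 4.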

5. Moment Statistics of Hyperelliptic Curves 

Having computed moment sequences attached to characteristic polynomials of 
random matrices in USp(2g), we now consider the corresponding moment statistics 
of a hyperelliptic curve. Under Conjecture 2, the latter should converge to the 
former, provided the curve has large Galois image. Tables 5-7 list moment statistics of hyperelliptic curves known to have large Galois image.

These tables were constructed by computing $L_p(T)$ to determine sample values of $a_k$ and $s_k$ for each $p$ (the $a_k$ are the coefficients, the $s_k$ are derived via the Newton identities). Central moment statistics of $X = a_k - E[a_k]$ and $X = s_k - E[s_k]$ were then computed by averaging $X^n$ over all $p \le N$ where the curve has good reduction.

g   X       M[X](n) = E[X^n]                                            OEIS
1   a_1     1, 0, 1, 0, 2, 0, 5, 0, 14, 0, 42, 0, 132, . . .            A126120
2   a_1     1, 0, 1, 0, 3, 0, 14, 0, 84, 0, 594, 0, 4719, . . .         A138349
    a_2     1, 1, 2, 4, 10, 27, 82, 268, 940, 3476, 13448, . . .        A138356
    a_2^-   1, 0, 1, 0, 3, 1, 15, 15, 105, 190, 945, 2410, . . .        A095922
3   a_1     1, 0, 1, 0, 3, 0, 15, 0, 104, 0, 909, 0, 9449, . . .        A138540
    a_2     1, 1, 2, 5, 16, 62, 282, 1459, 8375, 52323, . . .           A138549
    a_2^-   1, 0, 1, 1, 5, 16, 75, 366, 2016, 11936, 75678, . . .       A138550
    a_3     1, 0, 2, 0, 23, 0, 684, 0, 34760, 0, 2493096, . . .         A138551

Table 4. Moment Sequences $M[a_k]$ and $M[a_k^-]$ for $g \le 3$.

The notation $X^-$ denotes the random variable $X - 1$.



The values of $E[a_k]$ and $E[s_k]$ are as determined in the previous two sections: 0 for $k$ odd or $k > 2g$, and $\pm 1$ otherwise.

Using central moments, we find the moment statistic $M_1$ (the mean of $X$) very close to zero ($|M_1| < 0.001$), so we list only $M_2, \ldots, M_{10}$ for each $X$. Beneath each row the corresponding moments for USp(2g) are listed. Note that the value of $N$ is not the same in each table (we are able to use larger $N$ in lower genus), and the number of sample points is approximately $\pi(N) \approx N/\log N$.

Tables 8 and 9 show the progression of the moment statistics in genus 2 and 3 as $N$ increases, giving a rough indication of the rate of convergence and the degree of uncertainty in the higher moments.

The agreement between the moment statistics listed in Tables 5-7 and the moment sequences computed in Sections 4.3-4.4 is consistent with Conjecture 2. Indeed, on the basis of these results we can quite confidently reject certain alternative hypotheses.

As an example, consider the fourth moment of $a_1$ in genus 2. The value $M_4 = 3.004$ represents the average of 3,957,807 data points ($\approx \pi(2^{26})$). A uniform distribution on $a_1$ would imply the mean value of $a_1^4$ is greater than 50. The probability of then observing $M_4 = 3.004$ over a sample of nearly four million data points is astronomically small. A uniform distribution on the eigenvalue angles gives a mean of 36, yielding a similarly improbable event. In fact, let us suppose only that $a_1^4$ has a distribution with integer mean not equal to 3. We can then bound the standard deviation by 256 (since $a_1^8 \le 256^2$) and apply a Z-test to obtain an event probability less than one in a trillion.
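
The arithmetic behind this Z-test is short enough to spell out. The following sketch uses only the figures quoted in the text (sample size, observed $M_4$, and the bound 256 on the standard deviation); treating the test as one-sided and taking the nearest competing integer means to be 2 and 4 are assumptions of this illustration.

```python
from math import erfc, sqrt

SAMPLE_SIZE = 3_957_807   # number of primes used, as quoted in the text
OBSERVED_M4 = 3.004       # observed fourth moment statistic of a_1
SD_BOUND = 256.0          # bound on the standard deviation of a_1^4

# Smallest gap between the observation and an integer mean other than 3.
gap = min(abs(OBSERVED_M4 - 2), abs(OBSERVED_M4 - 4))

# Z-score of the sample mean, using the (very conservative) bound on the deviation.
z_score = gap * sqrt(SAMPLE_SIZE) / SD_BOUND

# One-sided Gaussian tail probability for that Z-score.
tail_probability = 0.5 * erfc(z_score / sqrt(2))

# The two alternative hypotheses mentioned in the text.
# a_1 uniform on [-4, 4]: E[a^4] = 4^4 / 5.
mean_uniform_a1 = 4 ** 4 / 5
# Independent uniform angles: with x = 2cos(theta), E[x^2] = 2 and E[x^4] = 6,
# so E[(x_1 + x_2)^4] = 2 * E[x^4] + 6 * E[x^2]^2.
mean_uniform_angles = 2 * 6 + 6 * 2 * 2

if __name__ == "__main__":
    print(z_score, tail_probability, mean_uniform_a1, mean_uniform_angles)
```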






X        M_2     M_3     M_4     M_5     M_6      M_7      M_8      M_9      M_10
a_1      1.000   0.000   2.000   0.000   5.000    0.001    14.000   0.002    42.000
         1       0       2       0       5        0        14       0        42
s_2^+    1.000   1.000   3.000   6.000   15.001   36.003   91.010   232.03   603.11
         1       1       3       6       15       36       91       232      603
s_3      2.000   0.000   6.000   0.000   20.000   0.000    69.998   0.000    252.00
         2       0       6       0       20       0        70       0        252
s_4      2.000   0.000   6.000   0.000   20.000   0.000    70.001   -0.001   252.00
         2       0       6       0       20       0        70       0        252

Table 5. Central moment statistics of $a_k$ and $s_k$ in genus 1, $N = 2^{35}$.
$y^2 = x^3 + 314159x + 271828$.



X        M_2     M_3      M_4      M_5      M_6      M_7      M_8      M_9       M_10
a_1      1.001   -0.001   3.004    -0.006   14.014   -0.031   84.041   -0.178    594.02
         1       0        3        0        14       0        84       0         594
a_2^-    1.001   0.000    3.003    0.997    15.013   14.964   105.10   190.00    947.38
         1       0        3        1        15       15       105      190       945
s_2^+    2.001   1.002    11.014   16.044   95.247   232.90   1089.3   3476.4    14891
         2       1        11       16       95       232      1085     3460      14820
s_3      2.002   0.001    12.014   -0.001   100.14   -0.147   981.54   -2.850    10603
         2       0        12       0        100      0        980      0         10584
s_4^+    3.004   1.010    21.049   26.150   215.66   500.32   2830.6   9075.6    43836
         3       1        21       26       215      493      2821     9040      43695
s_5      3.996   -0.015   35.958   -0.211   399.62   -3.152   4897.2   -47.602   63492
         4       0        36       0        400      0        4900     0         63504
s_6      3.999   -0.002   35.983   -0.023   399.81   -0.490   4898.0   -9.460    63487
         4       0        36       0        400      0        4900     0         63504

Table 6. Central moment statistics of $a_k$ and $s_k$ in genus 2, $N = 2^{26}$.
$y^2 = x^5 + 314159x^3 + 271828x^2 + 1644934x + 57721566$.

X        M_2     M_3      M_4      M_5      M_6      M_7       M_8      M_9       M_10
a_1      0.999   -0.007   2.995    -0.046   14.968   -0.343    103.76   -2.723    906.67
         1       0        3        0        15       0         104      0         909
a_2^-    0.999   0.996    4.992    15.982   75.049   366.87    2023.7   11990     75992
         1       1        5        16       75       366       2016     11936     75678
a_3      1.996   -0.034   22.940   -0.950   684.23   -22.334   34938    2360.8    2512126
         2       0        23       0        684      0         34760    0         2493096
s_2^+    2.000   0.001    10.998   0.969    94.977   35.182    1099.5   966.05    15812
         2       0        11       1        95       36        1099     982       15792
s_3      2.996   0.008    25.953   0.129    344.64   2.935     5759.4   73.138    111003
         3       0        26       0        345      0         5754     0         110586
s_4^+    3.002   0.980    27.023   25.574   384.80   697.45    7207.4   20004     163235
         3       1        27       26       385      708       7231     20296     164277
s_5      3.995   -0.036   41.906   -0.719   658.28   -16.625   12776    -428.23   281027
         4       0        42       0        660      0         12810    0         281736
s_6^+    5.001   0.968    63.005   45.334   1134.1   1782.0    25376    65650     663829
         5       1        63       46       1135     1800      25431    66232     666387
s_7      6.000   0.002    90.015   0.356    1860.6   13.010    44746    380.75    1172844
         6       0        90       0        1860     0         44730    0         1172556

Table 7. Central moment statistics of $a_k$ and $s_k$ in genus 3, $N = 2^{25}$.
$y^2 = x^7 + 314159x^5 + 271828x^4 + 1644934x^3 + 57721566x^2 + 1618034x + 141021$.






N      M_1      M_2     M_3      M_4           M_5      M_6           M_7       M_8
2^11   -0.071   1.031   -0.276   3.167         -2.295   15.250        -21.145   97.499
2^12   -0.036   1.112   -0.087   3.565         -0.475   17.251        -4.539    105.082
2^13   -0.067   1.085   -0.249   3.407         -1.567   16.537        -12.893   103.344
2^14   -0.046   1.029   -0.232   3.181         -1.529   15.795        -13.309   104.558
2^15   -0.044   1.031   -0.121   3.256         -0.428   16.325        -2.396    107.173
2^16   -0.025   1.022   -0.069   3.143         -0.251   15.251        -1.673    96.837
2^17   -0.016   1.011   -0.041   [illegible]   -0.204   [illegible]   -1.717    [illegible]
2^18   -0.009   1.002   -0.022   3.041         -0.138   14.441        -1.456    88.636
2^19   -0.002   1.003   -0.013   3.031         -0.108   14.259        -1.023    86.288
2^20   0.001    0.998   0.001    3.003         -0.041   14.126        -0.687    85.815
2^21   -0.000   1.003   -0.002   3.016         -0.045   14.088        -0.577    84.746
2^22   0.002    1.002   0.009    3.013         0.037    14.058        0.101     84.166
2^23   0.001    1.001   0.002    3.006         0.001    13.999        -0.103    83.715
2^24   0.000    1.001   0.001    3.002         0.008    13.964        0.036     83.346
2^25   0.000    1.000   -0.000   2.995         -0.010   13.950        -0.120    83.500
2^26   0.000    1.001   -0.001   3.004         -0.006   14.014        -0.031    84.041

Table 8. Convergence of moment statistics for $a_1$ in genus 2 as $N$ increases.
$y^2 = x^5 + 314159x^3 + 271828x^2 + 1644934x + 57721566$.



N      M_1      M_2     M_3      M_4     M_5      M_6      M_7      M_8
2^11   0.033    0.942   0.204    2.611   1.313    11.365   9.198    62.068
2^12   0.009    0.921   0.092    2.447   0.649    10.149   4.617    52.838
2^13   0.015    0.961   0.075    2.676   0.535    11.971   4.345    69.641
2^14   -0.011   0.983   -0.060   2.893   -0.245   14.316   0.704    99.690
2^15   -0.005   1.011   -0.018   3.134   -0.067   16.286   0.836    116.675
2^16   -0.017   1.007   -0.054   3.154   -0.105   16.813   2.952    127.212
2^17   -0.006   0.993   -0.024   3.027   -0.041   15.431   1.622    109.717
2^18   -0.005   0.996   -0.026   3.006   -0.110   15.196   0.239    106.901
2^19   -0.001   0.999   -0.013   2.985   -0.087   14.793   -0.418   101.662
2^20   0.000    0.989   -0.007   2.934   -0.072   14.440   -0.759   98.109
2^21   0.003    0.997   0.003    2.979   -0.017   14.796   -0.562   101.690
2^22   0.002    0.999   0.005    3.003   0.038    15.098   0.446    105.733
2^23   0.000    1.001   0.001    3.015   -0.005   15.138   -0.102   105.418
2^24   0.000    0.999   -0.004   2.990   -0.043   14.916   -0.397   103.271
2^25   0.000    0.999   -0.007   2.995   -0.046   14.968   -0.343   103.755

Table 9. Convergence of moment statistics for $a_1$ in genus 3 as $N$ increases.
$y^2 = x^7 + 314159x^5 + 271828x^4 + 1644934x^3 + 57721566x^2 + 1618034x + 141021$.

#      N      %    ΔM_1    ΔM_2    ΔM_3    ΔM_4    ΔM_5    ΔM_6    ΔM_7    ΔM_8
10^6   2^16   50   0.008   0.012   0.031   0.072   0.204   0.563   1.685   5.104
              90   0.020   0.029   0.076   0.177   0.497   1.371   4.120   12.397
              99   0.032   0.045   0.120   0.277   0.781   2.156   6.512   19.633
10^4   2^20   50   0.002   0.003   0.009   0.020   0.057   0.159   0.470   1.433
              90   0.006   0.008   0.021   0.049   0.138   0.384   1.154   3.485
              99   0.009   0.013   0.033   0.078   0.214   0.604   1.801   5.432
10^2   2^24   50   0.001   0.001   0.002   0.005   0.017   0.044   0.138   0.424
              90   0.002   0.002   0.006   0.013   0.035   0.101   0.277   0.933
              99   0.002   0.003   0.008   0.019   0.054   0.165   0.543   1.519

Table 10. Moment deviations for families of random genus 2 curves.



After perusing the data in Tables 5-9, several questions come to mind: 

(1) What is the rate of convergence as N increases? 

(2) Are these results representative of typical curves? 

(3) Can we distinguish exceptional distributions that may arise? 

Question 1 has been considered in genus 1 (but not, to our knowledge, in higher genera). For an elliptic curve without complex multiplication, the conjectured discrepancy between the observed distribution and the Sato-Tate prediction is $O(N^{-1/2})$ (see Conjecture 1 of [2] or Conjecture 2.2 of [38]). This conjecture implies that the generalized Riemann Hypothesis then holds for the L-series of the curve [2, Theorem 2]. We will not attempt to address this question here, other than noting that the figures listed in Tables 8 and 9 are not inconsistent with a convergence rate of $O(N^{-1/2})$.

5.1. Random Families of Genus 2 Curves. We can say more about Questions 2 and 3, at least in genus 2. To address Question 2 we tested over a million randomly generated curves of the form

$$y^2 = x^5 + f_4x^4 + f_3x^3 + f_2x^2 + f_1x + f_0,$$

with the integer coefficients $f_0, \ldots, f_4$ obtained from a uniform distribution on the interval $[-2^{63} + 1, 2^{63} - 1]$.

Table 10 describes the distribution of moment statistics for $a_1$ over three sets of computations: one million curves with $N = 2^{16}$, ten thousand curves with $N = 2^{20}$, and one hundred curves with $N = 2^{24}$. The rows list bounds on the deviation from the moment sequence for $a_1$ in USp(2g) that apply to m% of the curves, with m equal to 50, 90, or 99. One sees close agreement with the predicted moment statistics, with $\Delta M_n$ decreasing as $N$ increases. The maximum deviation in $M_4$ observed for any curve was 0.56 with $N = 2^{16}$.

We also looked for exceptional distributions among the outliers, considering the possibility that one or more curves in our random sample might not have large Galois image. From our initial family of one million random curves we selected one thousand curves whose moment statistics showed the greatest deviation from the predicted values. We recomputed the $a_1$ moment statistics of these curves, with the bound $N$ increased from $2^{16}$ to $2^{20}$. In each and every case, we saw convergence toward the moment sequence for $a_1$ in USp(4),

(A138349) $M[a_1]$ = 1, 0, 1, 0, 3, 0, 14, 0, 84, 0, 594, . . .

as predicted by Conjecture 2 for curves with large Galois image. An additional test of one hundred of the most deviant curves from within this group with $N = 2^{24}$ yielded further convergence, with $\Delta M_n < 1$ for all $n \le 8$. This is strong evidence that all the curves in our original random sample had large Galois image; every exceptional distribution we have found for $a_1$ in genus 2 has $\Delta M_8 > 200$.

One may ask whether convergence of the $a_1$ moment statistics to $M[a_1]$ is enough to guarantee that the L-polynomial distribution of the curve is represented by USp(2g), since we have not examined the distribution of $a_2$. If we assume the distribution of $L_p(T)$ is given by some infinite compact subgroup of USp(4) (as in Conjecture 2), then it suffices to consider $a_1$. In fact, under this assumption, much more is true: if the fourth moment statistic of $a_1$ converges to 3, then the distribution of $L_p(T)$ converges to the distribution of $\chi(T)$ in USp(4). This remarkable phenomenon is a consequence of Larsen's alternative [37, 25].

5.2. Larsen's Alternative. To apply Larsen's alternative we need to briefly introduce a representation theoretic definition of "moment" which will turn out to be equivalent to the usual statistical moment in the case of interest to us. Here we parallel the presentation in [25, Section 1.1], but assume $G$ to be compact rather than reductive. Let $V$ be a complex vector space of dimension at least two and $G \subseteq GL(V)$ a compact group. Define

(22) $$M_{a,b}(G, V) = \dim_{\mathbb{C}}\left(V^{\otimes a} \otimes (V^*)^{\otimes b}\right)^G,$$

and set $M_{2n}(G, V) = M_{n,n}(G, V)$. Let $\chi(A) = \mathrm{tr}(A)$ denote the character of $V$ as a $G$-module (the standard representation of $G$). We then have

$$M_{2n}(G, V) = \int_G \chi(A)^n\, \overline{\chi(A)}^{\,n}\, dA = \int_G |\chi(A)|^{2n}\, dA.$$

We now specialize to the case $V = \mathbb{C}^{2g}$ and suppose $G \subseteq USp(2g)$. Then

$$M_{2n}(G, V) = \int_G (\mathrm{tr}(A))^{2n}\, dA = E[(\mathrm{tr}(A))^{2n}] = M[a_1](2n),$$

where $a_1 = -\mathrm{tr}(A)$ and $M[a_1](n) = E[a_1^n]$ as usual. We can now state Larsen's alternative as it applies to our present situation.

Theorem 2 (Larsen's Alternative). Let $V$ be a complex vector space of even dimension greater than 2 and suppose $G$ is a compact subgroup of USp(V). If $M_4(G, V) = 3$, then either $G$ is finite or $G = USp(V)$.

This is directly analogous to Part 3 of Theorem 1.1.6 in [25], and the proof is the same.

COROLLARY 2. Let $C$ be a curve of genus $g > 1$. Under Conjecture 2, the distribution of $L_p(T)$ converges to the distribution of $\chi(T)$ in USp(2g) if and only if the fourth moment statistic of $a_1$ converges to 3.
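
The value $M_4 = 3$ for the full group can be confirmed numerically in genus 2. The sketch below integrates powers of $a_1 = -(2\cos\theta_1 + 2\cos\theta_2)$ against the eigenangle density $\frac{1}{2}(2\cos\theta_1 - 2\cos\theta_2)^2 \cdot \frac{2}{\pi}\sin^2\theta_1 \cdot \frac{2}{\pi}\sin^2\theta_2$ on $[0,\pi]^2$. That density is the Weyl integration formula for USp(4) as assumed here (it is not restated in this section), the function name is illustrative, and the midpoint rule used is exact up to rounding for the low-degree cosine polynomials involved.

```python
from math import cos, pi, sin


def usp4_moment(n, steps=64):
    """E[a_1^n] for a random matrix in USp(4), by midpoint quadrature over the eigenangles."""
    h = pi / steps
    nodes = [(i + 0.5) * h for i in range(steps)]
    total = 0.0
    for t1 in nodes:
        x1 = 2 * cos(t1)
        w1 = (2 / pi) * sin(t1) ** 2
        for t2 in nodes:
            x2 = 2 * cos(t2)
            w2 = (2 / pi) * sin(t2) ** 2
            # Assumed Weyl density for USp(4): (1/2!) * (x1 - x2)^2 * w1 * w2.
            density = 0.5 * (x1 - x2) ** 2 * w1 * w2
            total += (-(x1 + x2)) ** n * density
    return total * h * h


if __name__ == "__main__":
    print([round(usp4_moment(n), 6) for n in range(0, 9, 2)])
```

Under these assumptions the even moments reproduce the sequence 1, 1, 3, 14, 84 quoted for $a_1$ in USp(4), with the fourth moment equal to 3 as Larsen's alternative requires.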

The corollary provides a wonderfully effective way for us to distinguish curves with exceptional $L_p(T)$ distributions.

6. Exceptional $L_p(T)$ Distributions in Genus 2

While we were unable to find any exceptional $L_p(T)$ distributions among random genus 2 curves with large coefficients, if one restricts the size of the coefficients such cases are readily found. We tested every curve of the form $y^2 = f(x)$, with $f(x)$ a monic polynomial of degree 5 with coefficients in the interval $[-64, 64]$, more than $2^{35}$ curves. As not every hyperelliptic curve of genus 2 can be put in this form, we also included curves with $f(x)$ of degree 6 (not necessarily monic) and coefficients in the interval $[-16, 16]$. With such a large set of curves to test, we necessarily used a much smaller value of $N$, approximately $2^{12}$. We computed the moment statistics of $a_1$ using parallel point-counting techniques described in [28] to process 32 curves at once.

To identify exceptional curves, we applied a heuristic filter to bound the deviation of the fourth and sixth moment statistics from the USp(2g) values $M[a_1](4) = 3$ and $M[a_1](6) = 14$. By Larsen's alternative, it suffices to consider only the fourth moment; however, we found the sixth moment also useful as a distinguishing metric: the smallest sixth moment observed among any of the exceptional distributions was 35, compared to 14 in the typical case. A combination of the two moments proved to be most effective.

Searching for a small subset of exceptional curves in a large family using a statistical test necessarily generates many false positives: nonexceptional curves which happen to deviate significantly from the USp(2g) distribution for $p \le N$. The filter criteria were tuned to limit this, at the risk of introducing more false negatives (unnoticed exceptional curves). After filtering the entire family with $N \approx 2^{12}$, the remaining curves were filtered again with $N = 2^{16}$ to remove false positives. Finally, we restricted the resulting list to curves with distinct Igusa invariants [24], leaving a set of some 30,000 nonisomorphic curves with (apparently) exceptional distributions.

One additional criterion used to distinguish distributions was the ratio $z(C, N)$ of zero traces, that is, the proportion of primes $p$ for which $a_p = 0$, among $p \le N$ where $C$ has good reduction. For a typical curve, $z(C, N) \to 0$ as $N \to \infty$, but for many exceptional distributions, $z(C, N)$ converges to a nonzero rational number. In most cases where this arises, one can readily compute

(23) $$\lim_{N\to\infty} z(C, N) = z(C),$$

using the Hasse-Witt matrix. This is described in detail in [51], and the following is a typical example. One can show that the curve $y^2 = x^6 + 2$ has $a_p = 0$ unless $p$ is of the form $p = 6n + 1$, in which case

(24) $$a_p \equiv \binom{3n}{n}\, 2^n (2^n + 1) \bmod p.$$

It follows that for $p = 6n + 1$ we have $a_p = 0$ if and only if $2^n \equiv -1 \bmod p$. The integer $2^n = 2^{(p-1)/6}$ is necessarily a sixth root of unity mod $p$, and exactly one of these is congruent to $-1$. By the Cebotarev density theorem, this occurs for a set of density 1/6 among primes of the form $p = 6n + 1$. Combine this with the fact that $a_p = 0$ when $p$ is not of this form and we obtain $z(C) = 7/12$.

Under Conjecture 2, one can show that for any curve $C$ (of arbitrary genus), $z(C)$ must exist and is a rational number; however, we need not assume this here.^10 For each nonzero $z(C)$ in Table 11, we have identified a specific curve exhibiting the distribution for which one can show $\lim_{N\to\infty} z(C, N) = z(C)$. Typically, the Hasse-Witt matrix gives a lower bound on $\liminf_{N\to\infty} z(C, N)$, and by computing $G[\ell]$, the mod $\ell$ image of the Galois representation in $GSp(2g, \mathbb{Z}/\ell\mathbb{Z})$, one obtains the density of nonzero traces mod $\ell$, establishing an upper bound on $\limsup_{N\to\infty} z(C, N)$. It is generally not difficult to find an $\ell$ for which these two bounds are equal (the Cebotarev density theorem is invoked on both sides of the argument).
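
The zero-trace ratio for the example curve $y^2 = x^6 + 2$ discussed above can be estimated with a few lines of code. This is a hedged sketch rather than the point-counting method used for the experiments: it relies only on the criterion that $a_p = 0$ when $p \not\equiv 1 \bmod 6$, and when $p = 6n+1$ exactly if $2^n \equiv -1 \bmod p$. It also assumes that 2 and 3 are the only primes to exclude for bad reduction and that the criterion can be applied as stated to every remaining prime; the function names are illustrative.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def zero_trace_ratio(bound):
    """Estimate z(C, N) for y^2 = x^6 + 2 with N = bound, via the congruence criterion."""
    zeros = 0
    total = 0
    for p in primes_up_to(bound):
        if p in (2, 3):  # assumed primes of bad reduction
            continue
        total += 1
        if p % 6 != 1:
            zeros += 1  # a_p = 0 whenever p is not of the form 6n + 1
        elif pow(2, (p - 1) // 6, p) == p - 1:
            zeros += 1  # p = 6n + 1 and 2^n = -1 mod p
    return zeros / total


if __name__ == "__main__":
    print(zero_trace_ratio(2 ** 16), 7 / 12)
```

For moderate bounds the printed ratio should already be close to 7/12, consistent with the density argument in the text.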

After sorting the sequences of moment statistics and considering the values of $z(C)$ among our set of more than 30,000 exceptional curves, we were able to identify only 22 distributions that were clearly distinct (within the precision of our computations). We also tested a wide range of genus 2 curves taken from the literature [7, 15, 17, 22, 23, 33, 43, 48, 54, 55], most with coefficient values outside the range of our search family. In every case the $a_1$ moment statistics appeared to match one of our previously identified distributions. Conversely, several of the distributions found in our search did not arise among the curves we tested from the literature.

Table 11 lists the 23 distinct distributions for $a_1$ we found for genus two curves, including the typical case, which is listed first. The value of $z(C)$ and the first six moments of $a_1$ suffice to distinguish every distribution we have found. We also list the eighth moment statistic, which, while not accurate to the nearest integer, is almost certainly within one percent of the "true" value. We list only the moment statistics of $a_1$. Histograms of the first twelve $a_1$ distributions can be found in Appendix II. Additional $a_1$ histograms, along with moment statistics and histograms for $a_2$ and $s_k$ are available at http://math.mit.edu/~drew/.

The third distribution in Table 11 went unnoticed in our initial analysis (we later found several examples that had been misclassified) and is not a curve taken from the literature. We constructed the curve

(C) $y^2 = x^5 + 20x^4 - 26x^3 + 20x^2 + x$

to have a split Jacobian, isogenous to the product of the elliptic curve

($E_1$) $y^2 = x^3 - 11x + 14$,

which has complex multiplication, and the elliptic curve

($E_2$) $y^2 = x^3 + 4x^2 - 4x$,

which does not. For every $p$ where $C$ has good reduction, the trace of Frobenius is simply the sum of the traces of $E_1$ and $E_2$. As $E_1$ and $E_2$ are not isogenous (over $\mathbb{C}$), we expect their $a_1$ distributions to be uncorrelated.^11 It follows that the moment sequence of $a_1$ for the curve $C$ is simply the binomial convolution of the moment sequences of $a_1$ for the curves $E_1$ and $E_2$. Thus we expect the moment

10. A compact subgroup of USp(2g) has a finite number of connected components. The density of zero trace elements must be zero or one on each component.

The genus 1 traces may be correlated in a way that does not impact ai = —a p /^/p, e.g., both 
curves might have the property that p mod 3 determines a p mod 3. 
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# 


z[C) 


1 f 

M 2 


M4 


71 T 


ft r 


T(x) 




1 





1 


3 


14 


84 


x 5 + 


x + l 


2 





2 


10 


70 


588* 


x 5 — 


2x A + x 3 + 2x - 4 


3 





2 


11 


90 


888* 


x 5 + 


20a: 4 - 26a; 3 + 20a; 2 + x 


4 





2 


12 


110 


1203* 


X 5 + 


4x 4 + 3x 3 - x 2 - x 


■5 





4 


32 


320 


3581* 


X 5 + 


7x 3 + 32x 2 + 45a; + 50 


6 


1/6 


2 


12 


100 


979* 


x 5 — 


5x 3 — 5x 2 — x 


7 


1/4 


2 


12 


100 


1008* 


x 5 + 


2x 4 + 2x 2 - x 


8 


1/4 


2 


12 


110 


1257* 


x 5 — 


4x 4 - 2x 3 - 4a; 2 + x 


9 


1/2 


1 


5 


35 


293* 


x 5 - 


2x 4 + llx 3 + Ax 2 +Ax 


10 


1/2 


1 


6 


55 


601* 


x 5 — 


2x 4 - 3x 3 + 2x 2 + 8x 


11 


1/2 


2 


16 


160 


1789* 


X 5 + 


X 3 + X 


12 


1/2 


2 


18 


220 


3005* 


x 5 — 


3x 4 + 19x 3 + 4a- 2 + 56a; - 12 


13 


1/2 


4 


48 


640 


8949* 


x e + 


1 


14 


7/12 


1 


6 


50 


489* 


x 5 — 


4a; 4 - 3a: 3 - 7a: 2 - 2a; - 3 


15 


7/12 


2 


18 


200 


2446* 


x e + 


2 


1G 


5/8 


1 


G 


50 


502* 


X 5 + 


a; 3 + 2a; 


17 


5/8 


2 


18 


200 


2515* 


x 5 — 


10a; 4 + 50a; 2 - 25a; 


18 


3/4 


1 


8 


80 


894* 


x 5 — 


2a; 3 - x 


19 


3/4 


1 


9 


100 


1222* 


X 5 — 


1 


20 


3/4 


1 


9 


110 


1501* 


llx 6 


+ llx 3 -4 


21 


3/4 


2 


24 


320 


4474* 


X 5 + 


:r 


22 


13/16 


1 


9 


100 


1254* 


x 5 + 


3a; 


23 


7/8 


1 


12 


160 


2237* 


X 5 + 


2a: 



Table 11. Moments of $a_1$ for genus 2 curves $y^2 = f(x)$ with N = 2 



Column $z(C)$ is the density of zero traces ($a_p$ values). The starred values indicate 
uncertainty in the eighth moment statistic. In each case, if $T_8 = \lim_{N\to\infty} M_8$, we estimate 
that $-0.005 < 1 - M_8/T_8 < 0.01$ with very high probability (the larger uncertainty on the 
positive side is primarily due to an observed excess of zero traces for small values of N; 
we expect $M_8 < T_8$ in most cases). See Table 13 for predicted values of $T_8$ and $T_{10}$. 

statistics of $a_1$ for C to converge to 

(A138552*) 1, 0, 2, 0, 11, 0, 90, 0, 889, 0, 9723, ... 

which is the binomial convolution of the sequences (1, 0, 1, 0, 3, 0, 10, 0, 35, ...) and 
(1, 0, 1, 0, 2, 0, 5, 0, 14, ...) mentioned in the introduction. These are the $a_1$ moment 
sequences of elliptic curves with and without complex multiplication. In terms of 
moment generating functions, we simply have 

$M_C[a_1] = M_{E_1}[a_1] \, M_{E_2}[a_1]$. 

Provided the covariance between the $a_1$ distributions of $E_1$ and $E_2$ is zero, 
one can prove the $a_1$ moment statistics of C converge to the sequence above using 
known results for genus 1 curves (note that $E_1$ has complex multiplication and $E_2$ 
has multiplicative reduction at p = 107). 
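
The binomial convolution can be checked directly. The following Python sketch is an illustration, not part of the original computations. It assumes the standard closed forms for the two genus 1 moment sequences (Catalan numbers without complex multiplication, half the central binomial coefficients with it), which agree with the terms quoted above; the function names are hypothetical.

```python
from math import comb

def binomial_convolution(a, b):
    """Return c with c_n = sum_k C(n, k) * a_k * b_{n-k}."""
    n_terms = min(len(a), len(b))
    return [sum(comb(n, k) * a[k] * b[n - k] for k in range(n + 1))
            for n in range(n_terms)]

def non_cm_moments(n_terms):
    # Assumed closed form: 0 for odd n, Catalan number C_{n/2} for even n.
    return [0 if n % 2 else comb(n, n // 2) // (n // 2 + 1)
            for n in range(n_terms)]

def cm_moments(n_terms):
    # Assumed closed form: M_0 = 1, 0 for odd n, C(n, n/2)/2 for even n > 0.
    return [1 if n == 0 else (0 if n % 2 else comb(n, n // 2) // 2)
            for n in range(n_terms)]

moments_E1 = cm_moments(11)       # E_1 has complex multiplication
moments_E2 = non_cm_moments(11)   # E_2 does not
moments_C = binomial_convolution(moments_E1, moments_E2)
print(moments_C)
```

The even-index entries of `moments_C` are the values to compare with the third row of Table 11.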

Many of the distributions in Table 11 can be obtained from genus 1 moment 
sequences by constructing an appropriate genus 2 curve with split Jacobian, as 
shown in the next section. It is important to note that a curve whose Jacobian 
is simple (not split over Q) may still have an L-polynomial distribution matching 
that of a split Jacobian. Distribution #2, for example, corresponds to a Jacobian 
which splits as the product of two nonisogenous elliptic curves without complex 
multiplication. This distribution also arises for some genus 2 curves with simple 
Jacobians, including 

$y^2 = x^5 - x^4 + x^3 + x^2 - 2x + 1$. 

This is a modular genus 2 curve which appears as $C_{188,A}$ in [17], along with many 
similar examples. 

We speculate that simple Jacobians with Distribution #2 are all of type I(2) 
in the classification of Moonen and Zarhin [39], corresponding to Jacobians whose 
endomorphism ring is isomorphic to the ring of integers in a real quadratic extension 
of Q. A similar phenomenon occurs with Distribution #11, which arises for split 
Jacobians that are isogenous to the product of an elliptic curve and its twist, but 
also for simple Jacobians of type II(1) in the Moonen-Zarhin classification (these 
are QM-curves, see [22] for examples). The remaining two types of simple genus 
2 Jacobians in the Moonen-Zarhin classification are type I(1), which is the typical 
case (Jacobians with endomorphism ring Z), and type IV(2,1), which occurs for 
curves with complex multiplication over a quartic CM field (with no imaginary 
quadratic subfield). These correspond to Distributions #1 and #19 respectively 
(examples of the latter can be found in [54]). The remaining distributions appear 
to arise only for curves with split Jacobians. 

7. Representation of Genus 2 Distributions in USp(4) 

Conjecture 2 implies that each distribution in Table 11 is represented by the 
distribution of characteristic polynomials in some infinite compact subgroup H of 
USp(4). In this section we will exhibit such an H for each distribution. We do 
not claim that each H we give is the "correct" subgroup for every curve with the 
corresponding distribution, in the sense of corresponding to the Galois image $G_\ell$, 
as discussed in Section 3.$^{12}$ Rather, for each H we show that its density of zero 
traces and moment sequence are compatible with the corresponding data in Table 
11. In most cases we also have evidence that suggests H is the correct subgroup 
for the particular corresponding curve listed in Table 11 (see Section 7.1). 

For all but two cases we will construct H using two subgroups of USp(2) that 
represent distributions of genus 1 curves: $G_1 = USp(2)$ for an elliptic curve without 
complex multiplication, and $G_2 = N(SO(2))$, the normalizer of SO(2) in SU(2), for 
an elliptic curve with complex multiplication. We will construct each $H \subset USp(4)$ 
from $G_1$ and/or $G_2$ explicitly as a group of matrices, although the motivation behind 
these constructions comes from split Jacobians. 



12 Indeed, a single H for each distribution would not suffice. As noted above, two curves may 
share the same $L_p(T)$ distribution for different reasons. 

The most obvious cases correspond to the product of two nonisogenous elliptic 
curves, and we have the groups $G_1 \times G_1$, $G_1 \times G_2$, and $G_2 \times G_2$ as groups of block 
diagonal matrices (with $2 \times 2$ blocks) in USp(4). These correspond to Distributions #2, #3, 
and #8 in Table 11, and their moment sequences are easily computed via binomial 
convolutions of the appropriate genus 1 moment sequences (the example of the 
previous section corresponds to $G_1 \times G_2$). 

To obtain additional distributions, we also consider the product of two isogenous 
elliptic curves. We may, for example, pair an elliptic curve with an isomorphic 
copy of itself, or with one of its twists.$^{13}$ For two isomorphic curves, the corresponding 
subgroup $H_i$ contains block diagonal matrices of the form 

$\begin{pmatrix} A & 0 \\ 0 & A \end{pmatrix}$, 

where A is an element of $G_i$ (i = 1 or 2). To pair a curve with its twist we also 
include block diagonal matrices with A and $-A$ on the diagonal to obtain the 
subgroup $H_i^-$. 

We now generalize this idea. Let $G = G_1$ (resp. $G_2$) be a compact subgroup 
of USp(2), and let $G^*$ be the subgroup of U(2) obtained by extending G by scalars 
and taking the subgroup of elements whose determinants are kth roots of unity (for 
some positive integer k). For $A \in G^*$, let $\bar{A}$ denote the complex conjugate of A and 
define the block diagonal matrix 

$B = \begin{pmatrix} A & 0 \\ 0 & \bar{A} \end{pmatrix}$. 

The matrix B is clearly unitary, and one easily verifies that it is also symplectic, 
hence $B \in USp(4)$. The set of all such B forms our subgroup H. As a topological 
group, H has k (resp. 2k) connected components, each a closed set consisting of 
elements with A having a fixed determinant, thus H is compact. The identity 
component is isomorphic to USp(2) (resp. SO(2)), embedded diagonally. 

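We may write $A \in G^*$ as $A = \omega^j A_0$ with $A_0 \in G \subset USp(2)$, $\omega$ a primitive 2kth 
root of unity, and $1 \le j \le k$. We then have 

$\mathrm{tr}(B) = \mathrm{tr}(A) + \mathrm{tr}(\bar{A}) = \omega^j \mathrm{tr}(A_0) + \omega^{-j} \mathrm{tr}(\bar{A}_0) = (\omega^j + \omega^{-j}) \mathrm{tr}(A_0)$, 

since $\mathrm{tr}(\bar{A}_0) = \mathrm{tr}(A_0)$ for any $A_0 \in USp(2)$. It follows that 

(25) $E_H[(\mathrm{tr}(B))^n] = \left( \frac{1}{k} \sum_{j=1}^{k} (\omega^j + \omega^{-j})^n \right) E_G[(\mathrm{tr}(A_0))^n]$, 

where $E_G[X]$ denotes the expectation of a random variable X over the Haar measure 
on G. 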

The term $(\omega^j + \omega^{-j})^n E_G[(\mathrm{tr}(A_0))^n]$ corresponds to the nth moment of the trace 
distribution on a component of H. These moments will all be integers precisely 
when $k \in \{1, 2, 3, 4, 6\}$ (they are zero for n odd). In fact, these are the only values 
of k for which H plausibly represents the distribution of $L_p(T)$ for a genus 2 curve 
defined over Q, as we now argue. 
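
The integrality condition is easy to test numerically. In the sketch below (an illustration with hypothetical function names), $\omega = e^{i\pi/k}$, so that $\omega^j + \omega^{-j} = 2\cos(\pi j/k)$. Since the genus 1 second moment is 1, the component moments are all integers exactly when $(\omega^j + \omega^{-j})^2 = 4\cos^2(\pi j/k)$ is an integer for every j. The search range $k \le 24$ and the floating point tolerance are arbitrary choices.

```python
from math import cos, pi

def component_factors(k):
    """Values (omega^j + omega^-j)^2 = 4*cos(pi*j/k)^2 for j = 1..k."""
    return [4 * cos(pi * j / k) ** 2 for j in range(1, k + 1)]

def all_integral(k, tol=1e-9):
    # True when every squared trace factor is (numerically) an integer.
    return all(abs(v - round(v)) < tol for v in component_factors(k))

integral_k = [k for k in range(1, 25) if all_integral(k)]
print(integral_k)
```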



13 See [7, Ch. 14] and [33] for explicit methods of constructing such curves. 

 H            k = 1   k = 2   k = 3   k = 4   k = 6
 H_1^k        5       11      4       7       6
 J(H_1^k)     11      18      10      16      14
 H_2^k        13      21      12      17      15
 J(H_2^k)     21      23      20      22      *

Table 12. Distributions matching $H_i^k$ or $J(H_i^k)$. 





Consider an L-polynomial $L_p(T)$ for which $\bar{L}_p(T) = L_p(p^{-1/2} T)$ is the characteristic 
polynomial of some $B \in H$. We may factor $L_p(T)$ as 

(26) $L_p(T) = (\chi(p) p T^2 - \alpha T + 1)(\bar{\chi}(p) p T^2 - \bar{\alpha} T + 1)$, 

where $\alpha \in \mathbb{Z}[\zeta]$ with $\zeta$ a primitive kth root of unity, and $\chi(p)$ is a kth root of unity 
equal to the determinant of A in the component of H containing B. For the elliptic 
curve we have in mind as a factor of the Jacobian, $\chi(p) p$ is the determinant of its 
Frobenius element and $\alpha$ is the trace. Since $L_p(T)$ has integer coefficients, $\alpha$ must 
lie in a quadratic extension of Q, giving $k \in \{1, 2, 3, 4, 6\}$. 

For each of these k we can easily compute closed forms for the parenthesized 
expression in (25). Assuming n > 0 is even, we obtain 

$2^n$, $2^n/2$, $(2^n + 2)/3$, $(2^n + 2^{n/2+1})/4$, $(2^n + 2 \cdot 3^{n/2} + 2)/6$, 

for k = 1, 2, 3, 4, 6, respectively. Since $E_G[(\mathrm{tr}(A_0))^n]$ is zero for odd values of n, 
these expressions may be used in (25) for all positive n. For even values of k, the 
density z(H) of zero traces in H is 1/k (resp. 1/2 + 1/(2k)), and z(H) is zero (resp. 
1/2) for k odd. 
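
As a cross-check, the closed forms can be combined with the genus 1 moments to reproduce rows of Table 13. This Python sketch is illustrative only. The function names are hypothetical, and the genus 1 moments are taken to be the Catalan numbers (for $G_1$) and half the central binomial coefficients (for $G_2$). For the zero-trace density in the $G_2$ case with k even it uses 1/2 + 1/(2k), the value consistent with Table 13 (for example 5/8 when k = 4).

```python
from fractions import Fraction
from math import comb

def scalar_factor(k, n):
    """Closed form of (1/k) * sum_j (omega^j + omega^-j)^n for even n > 0."""
    h = n // 2
    forms = {
        1: Fraction(2**n),
        2: Fraction(2**n, 2),
        3: Fraction(2**n + 2, 3),
        4: Fraction(2**n + 2 ** (h + 1), 4),
        6: Fraction(2**n + 2 * 3**h + 2, 6),
    }
    return forms[k]

def genus1_moment(i, n):
    """E_G[(tr A_0)^n] for G = G_1 (i = 1) or G_2 (i = 2), even n > 0."""
    central = comb(n, n // 2)
    return Fraction(central, n // 2 + 1) if i == 1 else Fraction(central, 2)

def moments(i, k, orders=(2, 4, 6, 8, 10)):
    """Even moments M_2..M_10 of the trace on the group built from G_i and k."""
    return [int(scalar_factor(k, n) * genus1_moment(i, n)) for n in orders]

def zero_density(i, k):
    """Density of zero traces on the group built from G_i and k."""
    if i == 1:
        return Fraction(1, k) if k % 2 == 0 else Fraction(0)
    return Fraction(1, 2) + Fraction(1, 2 * k) if k % 2 == 0 else Fraction(1, 2)

print(moments(1, 3), zero_density(1, 3))   # compare with row 4 of Table 13
print(moments(2, 4), zero_density(2, 4))   # compare with row 17 of Table 13
```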

For any of the subgroups H constructed as above, we may also consider the 
group J(H) generated by H and the block diagonal matrix 

(27) J = 

The group J(H) contains H as a subgroup of index 2, with the nonidentity coset 
having all zero traces. For n > 0, the nth moment of the trace distribution in J(H) 
is simply half that of H, and $z(J(H)) = (z(H) + 1)/2$. 
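
A short worked example of this rule, written as a Python sketch with a hypothetical helper name: starting from the zero-trace density and moments of H, it returns those of J(H). The input values are the row 4 entries of Table 13.

```python
from fractions import Fraction

def extend_by_J(z_H, moments_H):
    """Zero-trace density and moments (n > 0) of J(H), given those of H."""
    z_J = (Fraction(z_H) + 1) / 2
    moments_J = [Fraction(m, 2) for m in moments_H]
    return z_J, moments_J

# Row 4 of Table 13: z(H) = 0 and M_2..M_10 = 2, 12, 110, 1204, 14364.
z, m = extend_by_J(0, [2, 12, 110, 1204, 14364])
print(z, [int(v) for v in m])
```

The result can be compared with row 10 of Table 13.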

For i = 1 or 2 and $k \in \{1, 2, 3, 4, 6\}$, let $H_i^k$ denote the group H constructed 
using $G_i$ and k. By also considering $J(H_i^k)$, we can construct a total of 20 groups, 18 
of which have distinct eigenvalue distributions. With the sole exception of $J(H_2^6)$, 
each of these matches one of the distributions in Table 11 to a proximity well within 
the accuracy of our computational methods. Note that $H_i^1 = H_i$, and the group 
$H_i^2$ has the same eigenvalue distribution as $H_i^-$ (but is not conjugate). 

We may also consider $J(G_1 \times G_1)$, which corresponds to Distribution #9. This 
construction does not readily apply to $G_1 \times G_2$ (in fact $J(G_1 \times G_2) = J(G_1 \times G_1)$). 
The group $J(G_2 \times G_2)$ does give a new distribution (it is the normalizer 
of $SO(2) \times SO(2)$ in USp(4)), but it does not correspond to any we have found. 
However, the group $J(G_2 \times G_2)$ contains a subgroup K not equal to $G_2 \times G_2$ which 
matches Distribution #19. The group K has identity component $SO(2) \times SO(2)$ 
and a cyclic component group of order 4 (this determines K). 

Of all the groups we have constructed, only K does not correspond, in some 
fashion, to a split Jacobian. As noted above, Distribution #19 arises for curves 
with simple Jacobians and complex multiplication over a quartic CM field, and in 
fact all nineteen such curves documented by van Wamelen have this distribution 
[54, 55]. The only remaining group to consider is USp(4) itself, which of course 
gives Distribution #1. 

Table 13 gives a complete list of the subgroups of USp(4) we have identified, 
one for each distribution in Table 11, and also entries for $J(H_2^6)$ and $J(G_2 \times G_2)$. 
These last two distributions appear to be spurious; see § 7.2. 

7.1. Supporting Evidence. Aside from closely matching the trace distributions 
we have found, there is additional data that supports our choice of the 
subgroups H appearing in Table 13. First, we find that these H not only match 
the distribution of the $a_1$ coefficient in $L_p(T)$, they also appear to give the correct 
distribution of $a_2$. We should note that for the three curves in Table 11 where f(x) 
has degree 6, the available methods for computing $L_p(T)$ are much less efficient, so 
we did not attempt this verification in these three cases.$^{14}$ 

More significantly, the disconnected groups in Table 13 also appear to give the 
correct distribution of $a_1$ (and of $a_2$, for f(x) of degree 5) on each of their components. 
If we partition the components of H according to their distributions of characteristic 
polynomials, for a given curve we can typically find a partitioning of primes into 
sets of corresponding density with matching $L_p(T)$ distributions. 

Taking Distribution #10 as an example, we have $H = J(H_1^3)$ in Table 13 and 
the corresponding curve in Table 11 is given by 

$y^2 = f(x) = x^5 - 2x^4 - 3x^3 + 2x^2 + 8x = x(x - 2)(x^3 - 3x - 4)$. 

The cubic $g(x) = x^3 - 3x - 4$ has Galois group $S_3$. The set of primes $P_3$ for 
which g(x) splits into three factors in $\mathbb{F}_p[x]$ has density 1/6, and corresponds to the 
identity component of $J(H_1^3)$.$^{15}$ The set of primes $P_2$ where g(x) splits into two 
factors has density 1/2, and corresponds to the nonidentity coset of $H_1^3$ in $J(H_1^3)$, 
containing three components with identical eigenvalue distributions. The remaining 
set of primes $P_1$ for which g(x) is irreducible has density 1/3 and corresponds to 
the set of elements of $H_1^3$ for which the determinant of the diagonal block 
A is not 1 (this includes two of the six components of $J(H_1^3)$). Table 14 lists the 
moment statistics for $a_1$ and $a_2$, restricted to the sets $P_1$, $P_2$, and $P_3$, with values 
for the corresponding subset of $J(H_1^3)$ beneath. 
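
The three densities can be observed empirically by counting roots of g modulo p. The sketch below is an illustration with hypothetical names and an arbitrary bound of 5000, not the computation used for Table 14. It relies on the fact that a separable cubic over $\mathbb{F}_p$ with 3, 1, or 0 roots has 3, 2, or 1 irreducible factors respectively. Because the discriminant of g is $-324 = -18^2$, the set $P_2$ consists exactly of the primes $p \equiv 3 \pmod 4$ with p > 3.

```python
from collections import Counter

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n**0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def root_count(p):
    """Number of roots of g(x) = x^3 - 3x - 4 in F_p."""
    return sum(1 for x in range(p) if (x * x * x - 3 * x - 4) % p == 0)

def splitting_densities(bound):
    # Skip p = 2 and p = 3, which divide the discriminant -324 of g.
    primes = [p for p in primes_up_to(bound) if p > 3]
    counts = Counter(root_count(p) for p in primes)
    total = len(primes)
    # 3 roots -> P_3, 1 root -> P_2, 0 roots -> P_1
    return {"P3": counts[3] / total,
            "P2": counts[1] / total,
            "P1": counts[0] / total}

densities = splitting_densities(5000)
print(densities)
```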

A similar analysis can be applied to the other disconnected groups in Table 13; 
however, the definition of the sets $P_i$ varies. For Distribution #11 the group has 
two components, and for the curve $y^2 = f(x) = x^5 + x^3 + x$ the correct partitioning 
of primes simply depends on the value of p modulo 4, not on how f(x) splits in 
$\mathbb{F}_p[x]$ (in fact, the set of primes where f(x) splits intersects both partitions). In 
other cases both a modular constraint and a splitting condition may apply. In 
general there is some partitioning of primes which corresponds to a partitioning of 
the components of H, and the corresponding distributions appear to agree. 
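
For the curve $y^2 = x^5 + x^3 + x$ the dependence on p modulo 4 can be seen with a naive point count. The following Python sketch is illustrative, with hypothetical function names. It computes the trace of Frobenius $a_p = p + 1 - \#C(\mathbb{F}_p)$ from a character sum, assuming the degree 5 model has a single point at infinity. Since $f(-x) = -f(x)$ and $-1$ is a nonresidue when $p \equiv 3 \pmod 4$, the sum vanishes for all such primes, so these primes account for the zero traces.

```python
def legendre(v, p):
    """Legendre symbol (v/p) for an odd prime p, via Euler's criterion."""
    v %= p
    if v == 0:
        return 0
    return 1 if pow(v, (p - 1) // 2, p) == 1 else -1

def trace_of_frobenius(p):
    """a_p = p + 1 - #C(F_p) for C: y^2 = x^5 + x^3 + x, with p an odd prime.

    With one point at infinity, #C(F_p) = p + 1 + sum_x (f(x)/p).
    """
    return -sum(legendre(x**5 + x**3 + x, p) for x in range(p))

# Odd primes below 200, found by trial division.
odd_primes = [p for p in range(3, 200)
              if all(p % q for q in range(2, int(p**0.5) + 1))]
traces = {p: trace_of_frobenius(p) for p in odd_primes}
print({p: a for p, a in traces.items() if p < 40})
```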

In addition to verifying the distribution of $L_p(T)$ over sets of primes, for each 
group corresponding to a split Jacobian, one can also check whether $L_p(T)$ admits 



14 The $a_1$ coefficient can be computed reasonably efficiently in the degree 6 case by counting points 
on C over $\mathbb{F}_p$, but the group law on the Jacobian is much slower. 
15 We thank Dan Bump for suggesting this approach. 

 #    H                 d    c(H)   z(H)    M_2   M_4   M_6   M_8    M_10
 1    USp(4)            10   1      0       1     3     14    84     594
 2    G_1 x G_1         6    1      0       2     10    70    588    5544
 3    G_1 x G_2         4    2      0       2     11    90    889    9723
 4    H_1^3             3    3      0       2     12    110   1204   14364
 5    H_1               3    1      0       4     32    320   3584   43008
 6    H_1^6             3    6      1/6     2     12    100   980    10584
 7    H_1^4             3    4      1/4     2     12    100   1008   11424
 8    G_2 x G_2         2    4      1/4     2     12    110   1260   16002
 9    J(G_1 x G_1)      6    2      1/2     1     5     35    294    2772
 10   J(H_1^3)          3    6      1/2     1     6     55    602    7182
 11   H_1^-             3    2      1/2     2     16    160   1792   21504
 12   H_2^3             1    6      1/2     2     18    220   3010   43092
 13   H_2               1    2      1/2     4     48    640   8960   129024
 14   J(H_1^6)          3    12     7/12    1     6     50    490    5292
 15   H_2^6             1    12     7/12    2     18    200   2450   31752
 16   J(H_1^4)          3    8      5/8     1     6     50    504    5712
 17   H_2^4             1    8      5/8     2     18    200   2520   34272
 18   J(H_1^-)          3    4      3/4     1     8     80    896    10752
 19   K                 2    4      3/4     1     9     100   1225   15876
 20   J(H_2^3)          1    12     3/4     1     9     110   1505   21546
 21   H_2^-             1    4      3/4     2     24    320   4480   64512
 22   J(H_2^4)          1    16     13/16   1     9     100   1260   17136
 23   J(H_2^-)          1    8      7/8     1     12    160   2240   32256
 *    J(G_2 x G_2)      2    8      5/8     1     6     55    630    8001
 *    J(H_2^6)          1    24     19/24   1     9     100   1225   15876

Table 13. Candidate subgroups of USp(4). 



Row numbers correspond to the distributions in Table 11. Column d lists the real 
dimension of H, and c(H) counts its components. The column z(H) gives the density of zero 
traces, and $M_n = E_H[(\mathrm{tr}(B))^n]$ for a random $B \in H$. The last two rows are not known to 
match the L-polynomial distribution of a genus 2 curve. 

a factorization of the expected type. For the product groups we expect $L_p(T)$ to 
factor into two quadratics with coefficients in Z. For the groups $H_i^k$ we expect a 
factorization of the form given in (26), and a similar form applies to $H_i^-$.$^{16}$ For 
each curve in Table 11 with f(x) of degree 5 corresponding to such a group, we 



16 As previously noted, there are curves with simple Jacobians matching distributions #2 and #11 
for which this would not apply, but they don't appear in Table 11. 

 Set   X     M_1      M_2     M_3      M_4     M_5      M_6     M_7      M_8
 P_1   a_1   0.000    1.001   -0.002   2.002   -0.007   5.005   -0.027   14.01
             0        1       0        2       0        5       0        14
       a_2   0.001    1.001   1.001    3.002   6.005    15.01   36.03    91.10
             0        1       1        3       6        15      36       91
 P_2   a_1   0.000    0.000   0.000    0.000   0.000    0.000   0.000    0.000
             0        0       0        0       0        0       0        0
       a_2   1.000    2.010   3.001    6.003   10.00    20.01   35.02    70.05
             1        2       3        6       10       20      35       70
 P_3   a_1   -0.002   3.999   -0.045   31.95   -0.590   319.3   -7.69    3574
             0        4       0        32      0        320     0        3584
       a_2   2.999    9.995   36.97    149.8   652.7    3005    14404    71160
             3        10      37       150     654      3012    14445    71398

Table 14. Component distributions of $a_1$ and $a_2$ for curve #10. 



verified the existence of the expected factorization for primes $p < 10^6$. For groups 
with multiple components, we partitioned the primes appropriately (note that for 
groups of the form J(H) we do not expect a factorization for primes corresponding 
to components with off-diagonal elements). 

The final piece of evidence we present is much less precise, and of an entirely 
different nature. By computing the group structure of the Jacobian and determining 
the rank of its $\ell$-Sylow subgroup for many p, one can estimate the size of the mod 
$\ell$ Galois image $G[\ell]$ in $GSp(2g, \mathbb{Z}/\ell\mathbb{Z})$. The primes for which the $\ell$-Sylow subgroup 
has rank 2g correspond to the identity element in $G[\ell]$. For g = 2, the group $GSp(2g, \mathbb{Z}/\ell\mathbb{Z})$ 
has size $O(\ell^{11})$, and this corresponds to the real dimension of USp(2g), which is 10 
(the reduction in dimension arises from unitarization). For all but the first group 
in Table 13, the real dimension is at most 6. We should expect correspondingly 
small $G[\ell]$, at most $O(\ell^7)$ in size. By computing 

(28) $d = \frac{\log(\#G[\ell_1]) - \log(\#G[\ell_2])}{\log \ell_1 - \log \ell_2}$ 

for various $\ell_1 \ne \ell_2$ we obtain a general estimate for $\#G[\ell] = O(\ell^d)$. The value 
$d - 1$ is then an estimate of the real dimension of H. This is necessarily a rather 
crude approximation, and one must take care to avoid exceptional values of $\ell$. We 
performed this computation for each exceptional curve in Table 11 where f(x) has 
degree 5 for $p < N = 2^{24}$ and $\ell$ ranging from 3 to 19. The results agreed with the 
corresponding dimensions in Table 13 to within ±1. 
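
To illustrate the estimate (28), the sketch below applies it to the typical case, where the mod $\ell$ image is assumed to be all of $GSp(4, \mathbb{Z}/\ell\mathbb{Z})$. The exact group order stands in for a measured value of $\#G[\ell]$, which in practice would be estimated from $\ell$-Sylow rank statistics. The function names are hypothetical.

```python
from math import log

def gsp4_order(l):
    """Order of GSp(4, Z/lZ) for a prime l: (l - 1) * l^4 * (l^2 - 1) * (l^4 - 1)."""
    return (l - 1) * l**4 * (l**2 - 1) * (l**4 - 1)

def growth_exponent(size1, l1, size2, l2):
    """Estimate d in #G[l] = O(l^d) from two primes l1 != l2, as in (28)."""
    return (log(size1) - log(size2)) / (log(l1) - log(l2))

d = growth_exponent(gsp4_order(3), 3, gsp4_order(19), 19)
print(d, round(d - 1))
```

With $\ell_1 = 3$ and $\ell_2 = 19$ the estimate of d is a little above 11, so $d - 1$ rounds to the real dimension 10 of USp(4), which shows how crude the approximation is even with exact group orders.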

7.2. Nonexistence Arguments. We wish to thank Jean-Pierre Serre (private 
communication) for suggesting the following argument to rule out the case 
$H = J(G_2 \times G_2)$. Let E be the endomorphism algebra of the Jacobian of the curve 
over Q. Then $E \otimes \mathbb{Q}$ must be a commutative Q-algebra of rank 4; more precisely, it 
must be either $K \otimes_{\mathbb{Q}} K'$ for some nonisomorphic imaginary quadratic fields K, K', 
or a quartic CM-field. (We cannot have K = K' or else H would be forced to have 
dimension 1 rather than 2.) In either case, $\mathrm{Aut}(E \otimes \mathbb{Q})$ has at most 4 elements, so 
the elements of H are defined over a field of degree at most 4 over Q. However, 
$J(G_2 \times G_2)$ has 8 connected components, so this is impossible. 

We do not yet have an analogous argument to rule out $H = J(H_2^6)$. However, 
Serre suggests that it should be possible to give such an argument based on the 
following data: the algebra $E \otimes \mathbb{Q}$ must be a (possibly split) quaternion algebra 
over a quadratic field, and the image of Galois in $\mathrm{Aut}(E \otimes \mathbb{Q})$ must have order 24. 

8. Conclusion 

Based on the results presented in Section 6, we now state a more explicit form 
of Conjecture 2 for genus 2 curves. 

Conjecture 3. Let C be a genus 2 curve. The distribution of $L_p(T)$ over 
p < N converges (as $N \to \infty$) to the distribution of $\chi(T)$ in one of the first 23 
subgroups of USp(4) listed in Table 13. For almost all curves, this group is USp(4). 

It would be interesting to carry out a similar analysis in genus 3, but it is not 
immediately clear from the genus 2 results how many exceptional distributions one 
should expect. An exhaustive search of the type undertaken in genus 2 may not be 
computationally feasible in genus 3. 

We end by once again thanking Nicholas Katz for his invaluable support through- 
out this project, and David Vogan for several helpful conversations. We also thank 
Zeev Rudnick for his feedback on an early draft of this paper, and Jean-Pierre Serre 
for his remarks in § 7.2. 

9. Appendix I - Distributions of $s_k$ 

This appendix is a gallery of distributions of $s_k$ for hyperelliptic curves of 
genus 1, 2, and 3 with large Galois image. The $s_k$ are the kth power sums (Newton 
symmetric functions) of the roots of the unitarized L-polynomial $\bar{L}_p(T)$ defined in 
Section 2. 

Each figure represents a histogram of approximately $\pi(N)$ values derived from 
$L_p(T)$ for p < N where the curve has good reduction. The horizontal axis ranges 
from $-\binom{2g}{k}$ to $\binom{2g}{k}$, divided into approximately $\sqrt{\pi(N)}$ buckets. The vertical axis 
is scaled to fit the data, with the height of the uniform distribution indicated by a 
dotted line. 
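
The bucketing scheme can be sketched on a small scale. The Python example below is illustrative, uses hypothetical function names, and covers far fewer primes than the figures. It computes the normalized trace $a_p/\sqrt{p}$ for the genus 1 curve $y^2 = x^3 + 314159x + 271828$ by a naive character sum over odd primes of good reduction below 2000, and sorts the values into roughly the square root of the number of values buckets on $[-2, 2]$.

```python
from math import isqrt, sqrt

A, B = 314159, 271828            # y^2 = x^3 + A*x + B
DISC = 4 * A**3 + 27 * B**2      # an odd p is a bad prime when it divides this

def is_prime(n):
    return n > 1 and all(n % q for q in range(2, isqrt(n) + 1))

def legendre(v, p):
    """Legendre symbol (v/p) for an odd prime p, via Euler's criterion."""
    v %= p
    if v == 0:
        return 0
    return 1 if pow(v, (p - 1) // 2, p) == 1 else -1

def normalized_trace(p):
    """a_p / sqrt(p), with a_p = -sum_x ((x^3 + A*x + B)/p)."""
    a_p = -sum(legendre(x**3 + A * x + B, p) for x in range(p))
    return a_p / sqrt(p)

def histogram(values, lo, hi):
    """Counts of values in about sqrt(len(values)) equal-width buckets."""
    buckets = max(1, round(sqrt(len(values))))
    counts = [0] * buckets
    width = (hi - lo) / buckets
    for v in values:
        index = min(int((v - lo) / width), buckets - 1)
        counts[index] += 1
    return counts

primes = [p for p in range(3, 2000) if is_prime(p) and DISC % p != 0]
values = [normalized_trace(p) for p in primes]
counts = histogram(values, -2.0, 2.0)
print(len(counts), counts)
```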

9.1. Genus 1. $N = 2^{35}$; $y^2 = x^3 + 314159x + 271828$. 

10. Appendix II - Distributions of $a_1$ in Genus 2 

This appendix contains $a_1$ histograms for each of the first 12 curves listed in 
Table 11, computed with $N = 2^{26}$. The approximate area of the central spike is 
given by z(C) in Table 11, corresponding to primes for which $a_p = 0$. The secondary 
spikes at $a_1 = \pm 2$ appearing in distributions #8 and #12 have approximately zero 
area. 



[Histogram of $a_1$ for distribution #1] 